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I. 


INTRODUCTION 


This  thesis  presents  an  introduction  to  Hidden  Markov  models  (HMM)  and  their 
applications  to  classification  problems.  HMMs  have  been  used  extensively  to  model  the 
temporal  stracture  and  variability  of  speech  and  other  signals  in  the  last  decade.  For 
example,  they  have  become  a  major  player  in  the  speech  area  [4,  5,  9].  We  selected  to 
write  our  own  HMM  implementation  even  though  sophisticated  HMM  software  is  readily 
available  on  the  market  to  better  understand  the  basic  concepts  behind  the  theory.  Our 
implementation  uses  MATLAB  because  it  is  a  high-level  language  easy  to  work  with. 
As  a  result,  the  software  may  not  be  as  fast  as  that  obtained  with  a  compiled 
implementation  but  it  is  easy  to  understand,  which  was  one  of  the  main  goals  of  the 
research.  We  tested  our  software  on  a  limited  isolated  4-word  recognition,  and  we  also 
applied  our  implementation  to  the  recognition  of  mine-like  objects  buried  in  shallow 
sand,  using  seismo-acoustic  data  obtained  from  an  on-going  project  headed  by  Prof.  Muir 
from  the  Physics  Department  at  the  Naval  Postgraduate  School.  Finally,  we 
benchmarked  our  results  against  those  obtained  using  a  back-propagation  neural  network 
implementation. 

Chapter  n  introduces  the  concepts  of  HMMs  in  an  “engineering-oriented”  simple 
fashion.  We  illustrate  the  theoretical  concepts  with  basic  examples  to  facilitate  the 
understanding  of  this  difficult  topic.  We  cover  the  three  specific  problems  which  HMM 
address,  their  solutions,  emphasizing  the  computational  savings  obtained  with  the 
forward,  backward,  Viterbi  and  Baum-Welch  algorithms. 


1 


Chapter  in  presents  the  application  of  HMMs  to  a  simple  speech  classification 
problem,  using  four  isolated  three-syllable  words:  “Microsoft,”  “Statistics,”  “Instructor” 
and  “Professor.”  Our  goal  is  to  show  how  a  generic  classifier  can  be  set-up  through  the 
simple  speech  recognition  example.  We  introduce  the  concept  of  vector  quantization 
(VQ)  applied  to  generate  discrete  symbols  from  the  speech  feature  vectors  created  with 
Linear  Prediction  Coefficients  (LPC)  and  energy  coefficients.  Two  different 
implementations  of  VQ  are  considered  1)  a  Neural  Network  (NN)-based  VQ  scheme;  and 
2)  a  K-means  algorithm  [9].  In  addition,  we  point  out  the  potential  numerical  difficulties 
which  can  be  encountered  while  setting  up  and  implementing  the  HMM  software. 

Chapter  IV  considers  the  application  of  the  HMM-based  classifier  to  the 
classification  of  two  mine-like  objects  buried  in  sand;  a  cylinder  and  a  powder  keg  with 
weights  ranging  from  71kg  to  290kg.  Results  show  that  recognition  performance  is  good 
under  various  conditions  of  weight  and  distance  of  the  object. 

Chapter  V  presents  a  back-propagation  neural  network  classifier  designed  to 
recognize  the  two  mine-like  objects  discussed  in  Chapter  IV.  This  implementation  was 
conducted  on  the  same  data  to  compare  the  performance  of  the  two  schemes.  Results 
show  the  performances  to  be  similar  (around  95%),  but  the  HMM-based  implementation 
procedure  is  faster  than  the  NN-based  implementation. 
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11.  INTRODUCTION  TO  HIDDEN  MARKOV  MODELS 

This  chapter  introduces  the  basic  theory  of  Hidden  Markov  models  (HMM), 
which  are  a  very  powerful  technique  for  modeling  the  temporal  structure  and  variability 
in  speech  and  other  applications.  The  understanding  of  HMM  is  very  important  in  order 
to  use  them  properly  and  achieve  the  best  results  for  signal  classification. 

A.  HMM  BACKGROUND 

Hidden  Markov  Model  theory  was  first  introduced  in  the  late  1960s  and  early 
1970s  by  Baker  at  Camegie-Mellon  University  and  Jeninek  and  colleagues  at  IBM  for 
speech  recognition  [6].  Since  then  they  have  been  used  extensively  for  speech 
applications,  and  also  successfully  to  other  tasks  such  as  human  face  identification,  lip 
and  speech-reading,  optical  character  recognition  and  time  DNA  modeling  ([6]  and 
references  therein).  The  reason  for  this  wide  range  of  applications  is  the  rich 
mathematical  structure  the  HMMs  are  built  on,  yielding  optimal  results  if  used  properly. 
A  detailed  overview  of  HMM  theory  was  presented  by  Rabiner  in  the  late  1980s  [3,4]. 

There  are  three  main  reasons  why  we  may  need  to  model  a  signal.  First,  we  apply 
signal  modeling  to  mathematically  describe  the  signal,  so  that  we’ll  be  able  to  process  it, 
for  example  to  denoise  a  speech  signal.  Models  are  also  important  because  they  let  us 
describe  the  signal  source,  which  doesn’t  have  to  be  directly  available  to  the  user.  For 
example  we  cannot  produce  real  seismic  waves  without  an  earthquake,  but  we  can  create 
models  of  such  signals  and  then  process  them.  Finally,  the  most  important  reason  for  the 
widely  spread  use  of  signal  models  is  that  they  perform  well  in  practice  [3]. 

There  are  two  types  of  signal  models;  deterministic  and  statistical.  Deterministic 
modeling  is  applied  when  dealing  with  signals  with  known  physical  characteristics.  For 
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example  a  sinewave  is  completely  specified  by  its  frequency,  phase  and  amplitude. 
Stochastic  modeling  is  applied  when  one  tries  to  characterize  only  the  statistical 
properties  of  the  signal.  For  example,  statistical  modeling  may  include  Gaussian  Poisson 
and  Markov  process  to  describe  events.  Usually,  real  applications  use  both  deterministic 
and  stochastic  modeling.  In  this  work  we  focus  on  statistical  signal  modeling,  using  the 
Hidden  Markov  Model  (HMM). 

B.  INTRODUCTION 

The  HMM  theory  is  based  on  the  Markov  Chain.  We  can  define  the  Markov 
Chain  as  a  probabilistic  description  of  transitions  between  a  system’s  states.  A  state  can 
be  a  property,  or  generally  a  condition,  that  a  system/model  might  have  at  a  particular 
instance.  The  HMM  consists  of  an  underlyining  Markov  chain  describing  the 
probabilistic  status  between  the  states,  as  that  shown  in  Figure  2.1,  which  illustrates  a 
three  state  left-right  model.  For  example,  suppose  we  want  to  model  a  speech  signal. 
First,  the  signal  is  split  into  T  time  frames.  Then,  a  set  of  parameters  (such  as  LPC 
coefficients,  energy,  etc...)  is  extracted  together  with  a  set  of  symbols  for  each  time 
frame.  The  sets  of  symbols  represent  each  frame  characteristics,  and  are  called 
observations.  As  a  result,  the  entire  model  is  a  sequence  of  symbols,  and  each  symbol  is 
a  system  model  depicting  each  segment.  In  most  cases  we  choose  the  segment  length 
empirically,  but  sometimes  adjust  it  so  that  it  is  large  enough  to  contain  all  the 
information  (usually  spectral)  that  makes  it  unique,  by  comparison  with  the  other 
segments,  due  to  possible  change  in  the  signal  behavior.  Generally,  we  can  assume  that 
we  reach  an  optimal  number  of  frames  when,  by  decreasing  the  frame  length,  we  generate 
models  identical  to  those  already  generated.  At  this  point,  HMMs  take  advantage  of  the 
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properties  existing  between  adjacent  segments  properties  by  addressing  the  following 
three  problems  [3]:  1)  how  to  identify  the  characteristic  frames;  2)  how  to  characterize 
the  relation  between  all  successive  segments;  and  3)  what  types  of  properties  should  be 
extracted  to  model  each  segment. 

C.  DISCRETE  MARKOV  PROCESS 

An  accurate  definition  for  the  HMM  according  to  Rabiner  [3]  is  that  “A  HMM  is 
a  doubly  stochastic  process  that  is  not  observable  (it  is  hidden),  but  can  only  be  observed 
through  another  set  of  stochastic  processes  that  produce  the  sequence  of  observed 
symbols.” 

Consider  a  system  which  exists  at  time  t  in  one  of  N  possible  potential  states,  as 
illustrated  in  Figures  2.1  and  2.2,  where  N=3.  Each  of  the  three  circles  represents  a  state 
of  the  model.  At  a  specific  discrete  time  instant  t  within  frame  k,  the  model  is  always  at 
one  of  those  three  states,  and  we  observe  the  sequence  Ok.  Generally,  given  that  the 
system  has  N  states  S  =  {si,  S2,  .  .  .  ,Sn},  where  Sj  is  the  state,  at  every  time  t=k,  the 
model  passes  through  the  sequences  of  states  Q  =  {qi,  qa,  •  •  •  ,qt},  where  qk  is  the  state  at 
time  k.  The  model  that  describes  all  this  information  is  called  a  Markov  chain.  As  we 
can  see  from  the  example  in  Figure  2.1, 

~^k+l’  (2.1) 

which  means  that  if  the  model  at  time  t=k  is  at  state,  it  can  remain  at  the  same  state  at 
time  t=k+l.  Note  that  no  backward  transitions  are  allowed  in  Fig  2.1.  As  a  result,  this 
model  is  called  a  “left-to-right”  model  (the  model  is  called  ergotic  when  backward 
transitions  are  allowed,  as  illustrated  in  Fig  2.2). 
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We  define  the  probability  ajj  as  the  transition  probability  from  state  i  to  state  j. 
For  example,  a23  is  the  probability  of  going  from  the  2““*  to  the  3^^  state,  and  is  defined  as: 

aij  =  P[qt  =  j\qt  -x  =  i],  \<i,j  <N ,  (2.2) 

with  the  following  constrains: 


ay>0  Vi,  7, 
2ay  =  l  Vi, 

;=i 


(2.3) 


since  all  probabilities  are  positive  numbers  and  the  summation  of  all  transition 
probabilities  is  1. 
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ail  &22  ^33 


an  a22 


Figure  2.2  A  Markov  Process  (Chain),  ergotic  model 
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For  the  discrete-time  case,  a  system  governed  by  known  or  predictable  dynamics 
can  be  modeled  using  a  Markov  chain.  The  probability  of  observing  the  observation  Ot, 
given  the  model  X  at  a  time  t,  is  determined  by  the  state  at  the  time  preceding  states  [6]: 

P{Ot\X)  =  P\qt  =  Sj,qt-\  =  Si,  qt  -  7  =  Sk, . .  (2.4) 

For  a  first-order  Markov  chain,  eq  (2.4)  can  be  simplified  to: 

is»<r.  (2.5) 

ifc=l 

Finally,  we  define  the  initial  conditions,  the  probabilities  of  starting  (t=l)  at  the  i*  state  Si 
as: 

ni  =  P[q,=s,\  i  =  (2.6) 

1.  Example  1 

Let’s  evaluate  the  probability  of  the  observation  O  =  {si,  Si,  S3}  for  the  example 
shown  in  Figure  2.2.  So,  applying  eqs  (2.4)  and  (2.5)  to  the  example  shown  in  Figure  2.1 
leads  to: 

Pr(0|A)  =  p\{si  Si,  S3 7|2] 

Pt{0\X)  =  P[sJiS2]p[s2\sMs>] 

Pt{0\X)=  71^022012- 

2.  Example  2:  Browser  Tracking. 

Next,  let  us  consider  the  following  example  where  a  user  wishes  to  track  the 
browser  type  used  by  visitors  to  the  Web.  Assume:  1)  only  three  types  of  browsers  are 
available:  Microsoft  Explorer  (MIE),  Netscape  (NET)  and  America  Online’  s  browser 
(AOL);  2)  the  specific  browser  type  version  is  not  taken  into  account. 
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Table  2.1  Browser  Tracking 


User# 

1 

2 

3 

4 

Browser 

MS  Explorers  .0 

AOL4.0 

Netscape3.0 

Netscapeb.l 

Browser 

Observation 

1 

2 

3 

3 

Symbol 

An  observation  symbol  is  assigned  to  each  visitor’s  browser  types,  as  illustrated  in  Table 
2.1.  Note  that  we  assign  the  same  symbol  (3)  to  both  versions  of  Netscape,  as  the 
specific  browser  version  in  not  tracked  by  the  user.  In  addition  we  assume  transition 
probabilities  which  represent  probabilities  of  going  from  one  state  to  another,  i.e., 
browser,  to  another  are  available  from  previous  statistical  studies: 


a  = 


0.3  0.4  0.3 
0.2  0.5  0.3 
0.8  0.1  0.1 


Thus,  the  resulting  observation  sequence  is  0={  1,  2,  3,  3}.  Finally,  let’s  assume  that  the 
probability  the  first  user  uses  the  explorer  browser  is  7ti=0.5. 

Therefore: 

P{0 1 X)  =  P[l,  2,  3,  3 1  /I]  =  p[i]f[2  1 1]P[3 1 2]p[3 1 3]  = 

=>  P(0 1  A)  =  0.5  •  0.2  •  0.1  •  0.1  =  0.001. 

3.  Example  3:  Urn-ball  Example 

Note  that  the  states  are  directly  observable  in  the  previous  examples.  However,  in 
most  real  world  cases,  the  state  sequence  that  produces  a  given  sequence  of  patterns 
cannot  be  determinated,  and  the  model  is  said  to  be  “hidden”.  The  most  common 
example  for  this  type  of  description  is  the  um-ball  model,  according  to  Fig  2.3: 
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Figure  2.3.  The  classic  Um-ball  example.  There  are  three  0^=3)  urns  that  contain  a  large 
number  of  color  balls.  The  colors  of  the  balls  are  red,  green,  and  blue.  The  boy  we  assigned  to 
paint  the  balls  runs  out  of  blue  paint  sometime  while  painting  and  uses  sky  blue  afterwards.  Thus, 
eventhough  there  are  4  colors,  we  assign  the  same  symbol  for  blue  and  sky  blue  and  the  number  of 
symbols  is  three  (M=3).  Each  time  a  person  randomly  selects  an  urn,  he  selects  a  ball,  and 
announces  the  color.  The  process  is  repeated  4  (T=4)  times.  Note  that:  1)  we  don’t  know  from 
which  urn  the  ball  came  from,  as  we  only  have  a  color  sequence,  and  we  map  those  colors  with  the 
symbols  according  from  Table,  2.2;  2)  we  assume  that  it  is  a  left-to-right  model  for  simplicity. 


Table  2.2.  Um-ball  example.  T,  Q,  C,  O  values  parameters 


Clock  Time  T 

1 

2 

3 

4 

Um  (Hidden  State)  Q 

qi 

qi 

q2 

qs 

Color  C 

Red 

Blue 

Sky  Blue 

Green 

Observation  Symbol  Os 

1 

2 

2 

3 

Table  2.2  shows  that  we  have  T=4  observations  of  four  balls.  The  number  of  urns 
is  N=3  and  each  um  represents  one  of  the  3  hidden  states,  because  we  don’t  know  from 
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the  specific  um  number  each  ball  comes  from.  Further,  we  consider  that  the  sky  blue 
color  belongs  to  the  same  class  as  blue,  so  we  assign  the  same  observation  symbol  (Os=2) 
for  both  colors,  and  the  total  number  of  symbols  is  M=3.  The  possible  observation 
symbols  are  V={  1,  2,  3},  and  the  states  are  Q  =  {qi,  qi,  qs}-  We  also  assign  values  to  the 
transition  probability  matrix  A={aij},  and  initial  conditions  vector  n={7ti}: 


A  —  {l3y}— 


0.5  0.3  0.2 
0  0.8  0.2 
0  0  1 


n  =  {;;z;}=  {0.9,  0.1,  0}. 


Note  that  the  specific  upper  triangular  structure  of  A  indicates  the  model  considered  here 
is  left-to-right,  as  we  cannot  go  backward.  In  addition,  we  assume  that  the  probability  of 
selecting  the  first  um  is  90%.  Finally,  we  also  need  the  observation  symbol  probability 
distribution  at  every  state,  to  describe  the  model  completely.  This  information  is 
presented  in  the  matrix  B,  which  contains  the  probabilities  of  being  at  the  state  j  and 
observing  the  symbol  k,  such  as: 

B  =  {bjk],  \<j<N,l<k<M,  (2.7) 


with  the  following  properties  (like  A): 


bjkiO  \/j,k, 

M 

y/. 

jfe=i 


(2.8) 


Note  that  B  is  a  NxM  matrix  where  N  represents  the  number  of  hidden  states  and  M  the 
number  of  symbols.  Matrix  B  isn’t  restricted  to  be  square.  In  our  example,  a  possible  B 
matrix  may  be  defined  as: 
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B  =  bjk  = 


0.3  0.5  0.2 
0.4  0.4  0.2 
0.6  0.3  0.1 

The  statistical  model  is  completely  described  by  the  set  of  matrices  A,  B,  and  the 
vector  7t  and  usually  denoted  as: 

X  =  {A,B,z}  (2.9) 

Note  that  all  rows  of  matrices  A,  B  and  n  shown  sum-up  to  1,  thereby  providing  an  easy 
check  on  the  validity  of  the  model.  Extending  our  example  to  a  general  case,  Table  2.3 
lists  the  definition  of  all  HMM  parameters,  as  given  by  Rabiner  [3],  for  a  generic  HMM 
model: 

Table  2.3  HMMs  elements 

•  T:  Length  of  the  Observation  Sequence  (total  number  of  clock  times) 

•  N:  Number  of  States  in  the  Model 

•  M:  Number  of  observation  symbols 

•  Q={qb  q2,  • .  • ,  qN}:  States 

•  V={vi,  V2,  ...»  vm}:  Discrete  Set  of  Possible  Symbol  Observations 

•  A  =  {ay},  ay  =  Pr{qj  @  t+l|qi  @  t}:  State  Transition  Probability 
Distribution 

•  B  =  {bj(k)},  bj(k)  =  Pr(Ot=Vk|qt=Sj):  Observation  Symbol  Probability 
Distribution  in  State  j 

•  n  ={7ii},  Til  =  Pr(qi  @  t  =  1):  Initial  State  Distribution 


12 


D.  THE  THREE  BASIC  PROBLEMS  FOR  HMM’S 

Three  basic  problems  of  interest  must  be  solved  in  order  to  specify  the  model  X  = 

{A,  B,  n},  and  use  it  in  classification  applications  [3]: 

Problem  1:  How  to  compute  the  probability  of  the  observation  sequence  Pr(0|X), 
forO=  {Oi,02, . . .  ,Ot}. 

Problem  2:  How  to  compute  the  most  optimal  state  sequence  I  =  {ii,  i2,  •  •  •  dr}, 
for  a  given  observation  sequence  O  =  {Oi,  O2, . . . ,  Ot}  and  model  X. 

Problem  3:  How  to  adjust  the  model  parameters  X  =  {A,  B,  n}  to  maximize 
Pr(0|A). 

Problem  1  is  an  evaluation  problem,  because  we  want  to  evaluate  the  probability 
Pr(0|A,),  for  the  specific  model.  The  hidden  part  of  the  model  is  the  state  sequence  I  that 
we  attempt  to  find  in  Problem  2.  Note  that  there  are  many  possible  state  sequences,  but 
only  one  is  optimal.  Finally  in  Problem  3,  we  adjust  the  parameters  A,  B,  and  te  of  the 
model  X,  so  that  the  probability  Pr(0|X)  is  maximum,  i.e.,  we  attempt  to  optimize  the 
model  X. 

E.  SOLUTIONS  TO  THE  THREE  HMM  PROBLEMS 
1.  Problem  1:  Evaluation  of  Pr(0|X) 

Problem  1  deals  with  evaluating  the  probability  of  the  observation  sequence  O, 
given  the  model  X,  i.e.  Pr(0|A,).  Using  basic  probability  principles,  it  is  the  summation  of 
the  conditional  probability  Pr(0,I|X)  over  all  possible  state  sequences  I  [3]: 

Pr(0 1  A)  =  5;Pr(0 1  I,A)Pt{I  \  A,) ,  (2.10) 

all  I 
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This  evaluation  first  requires  the  definition  of  Pr(0,I|A,),  the  joint  probability  that  the 
observation  0={0i,  O2, . . .  ,Ot}  and  the  state  sequence  I={ii,  h,  •  •  ■ ,  ir}  occur  at  the 
same  time,  given  the  model  X  [6].  Using  the  Bayes’  rule  it  is  computed  as: 

Pr(0,  I\X)  =  Pr(C>  I  /,;i)Pr(/ 1 X) . 

Since  we  assume  independence  of  the  observation,  we  can  write: 

Pr((9  \l,X)  =  f[  Pr(a  I  i,,A)  =bi,  (Oi)bi,  (O2). .  .bi,  (Or) , 

(=1 

and 

Therefore,  replacing  eqs  (2.10 ),  and  (2.13)  into  (2.11 )  leads  to: 

I  {02)ai,i,bh  {Oi} -ai,.,i,bi^  (Ot)  . 

all  i 

As  an  example  of  this  computation.  Let’ s  apply  the  above  result  to  the  um-ball  example 
considered  earlier  in  section  C  and  in  Figure  2.3.  The  components  of  the  model 

X(A,  B,  7i)  are: 

'0.5  0.3  0.2]  ro.3  0.5  0.2' 

A=  0  0.8  0.2  ,  B=  0.4  0.4  0.2,  ;r  =  {0.9, 0.1, 0} . 

0  0  1  J  [0.6  0.3  0.1 

The  observation  sequence  is: 

0  =  {l.  2, 2,3;, 


(2.11) 


(2.12) 


(2.13) 


(2.14) 


14 


with  the  number  of  states  N=3,  clock  times  T=4,  and  the  symbol  dimension  M=3.  Recall 
that  we  don’t  know  the  state  sequence,  i.e.,  we  cannot  identify  which  specific  um,  the 
color  balls  came  from.  Using  eq  (2.14)  we  can  find  the  hidden  state  sequence  I,  by 
picking  that  which  leads  to  the  highest  probability  Pr(0,I|X).  For  example,  possible  state 
sequence  that  we  can  use  are  I={  1,1, 1,1 }  (which  means  that  all  balls  come  from  the  first 
um),  or  I={1,1,1,2}.  Note  that  this  model  is  a  left-to-right  model,  as  the  state  sequence 
doesn’t  go  backwards,  thus,  for  instance  the  state  sequence  I={  1, 3,2,2}  is  not  valid  as  we 
can’t  go  from  the  3rd  to  the  2nd  state.  Therefore: 


for  /  =  /  1, 1, 1,  2 }:  (first  3  balls  from  U'  um, 
last  one  from  2"^  um) 
Ttilbh{0'ifxhnbi2{02)anhbii{f)'^  •  •  •  atr  -  \irbiT(f)T ) 

=  T€i\bi)f\^aiuibii{^f^aiziibii{^^'  "aiT  -  mbtri^) 


=  0.9  0.3  0i0.5  0i0.5  0.3-0.2 
=  0.0010125 


for  I  ={3,3,21,  3}:  (all  balls  from  3'^  um) 

Tti\bh{pi)anizbi2{Oi)a  hizbii  (o,)..  '  aiT  -  uibirif^T ) 

=  ;rii£j/i(l)fliu2&/2(2)<2i2;3i>i3(2)-  •  -  air  -  xubtri^) 


=  0.90.610.310.3-1-0.1 


=  0.00486 


As  we  can  see,  the  number  of  combinations  of  all  possible  sequences  is  very 
large,  even  though  the  number  of  the  observations  T  is  relatively  small.  In  this  problem 
(T=4),  the  direct  computation  of  Pr(0|X)  requires  (2T-1)N^  multiplications  and  N^-1 
additions,  which  is  too  expensive  for  real  applications.  For  instance,  the  number  of 
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computations  is  (2*100-l)3*°°+3*°°-l  when  T=100  and  N=3.  Therefore,  the  more 
efficient  forward-backward  procedure  was  introduced  to  solve  this  problem  [3,  6]. 

a)  Forward  Procedure: 

We  define  the  forward  variable,  at(i),  as  the  probability  of  the  partial 
observation  sequence  {Oi,  O2, . . .  ,Ot},  until  time  t  and  state  qi  at  time  t,  given  the  model 
X,  such  as: 

a,(i)  =  Px(0„0^,...,0„i,  =q\X).  (2.15) 

The  forward  algorithm  defined  below  is  used  to  evaluate  all  possible  cxt(i)  variables: 

Initialization: 

a,(i)  =  Tc,b,(0, ),  \<i<N  (2. 16) 

Recursion: 

a,Jj)  =  ]bj(0„, )  fort  =  1,  2, . ..,7-1,  l<j<N  (2.17) 

1=1 

Termination: 

H0\^)=Yi.^^r(i)■  (2.18) 

Eq  (2.18)  is  also  know  as  the  Baum-Welch  probability.  As  we  can  see 

from  the  recursion,  the  forward  procedure  gets  its  name  from  the  fact  that  Ut+i  is  evaluated 
using  the  previous  value  of  the  forward  variable  Ot.  Computing  Pr(0|A,)  using  this 
procedure  requires  only  calculations  instead  of  the  2TN^-1  required  by  the  direct 
definition  [3].  For  example  direct  computation  requires  2TN^=7.4.10*^  computations, 
while  the  forward  procedure  requires  N^=1920  for  N=8  and  T=30. 


16 


ti  t2  ta  t4 


Clock  Time 

. > 


:  Current  position  (State,  Time).  Total  number  of  positions=N»T 

:  Possible  movement 


:  Most  likely  path 


Figure  2.4a  Trellis  Diagram  for  T=4,  and  N=3,  ergotic  model 
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ti  t2  ta  t4 


Clock  Time 

. > 


:  Current  position  (State,  Time).  Total  number  of  positions=N*T 
/f  :  Possible  movement  (for  the  left-right  model  they  are  less) 


:Most  likely  path 


Figure2.4b  Trellis  Diagram  for  T=4,  and  N=3,  left-to-right 
model  as  used  for  example  in  Figure  2.3.  Fewer  possible  paths 
than  in  ergotic  model  are  allowed. 


The  computational  savings  obtained  in  the  forward  procedure  are 
illustrated  in  Figure  2.4a  and  2.4b  which  present  all  possible  combinations  of  state 
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sequences  from  clock  time  ti  to  U  in  a  trellis  diagram.  Figure  2.4a  presents  the  trellis 
diagram  for  an  ergotic  model,  and  T=4,  N=3.  Each  calculation  of  cxt(i)  with  the  forward 
procedure  only  requires  the  computation  of  the  values  cct.i(j),  for  l<j^.  Note  that  the 
procedure  discards  the  routes  less  likely  to  occur,  so  that  the  probabilities  through  the 
previously  discarded  routes  do  not  get  reevaluated  at  the  next  iteration  time. 

Finally,  the  probability  Pr(0|^)  is  obtained  by  the  summation  in  eq  (2.18). 

b)  Backward  Procedure: 

Similarly,  we  define  the  backward  variable  Pt(i)  as  the  probability  of  the 
partial  observation  sequence  from  t+1  to  the  end,  given  qi  at  time  t  and  the  model  X  [3], 
where: 

p,(i)  =  Pt(0,,,.  . . .  Or\ir  =q,,  X).  (2.19) 

Pt(i)  is  computed  recursively  from  t=T  down  to  t=l  as  follows: 

Initialization: 

=  l<i<N  .  (2.20) 

Recursion: 

P,(i)  =  f^ayb/0„, )l3,Ji),  r  =  r-1 . 7’-2 , . . . ,  1  ,l<i<N.  (2.21) 

;=i 

Note  that  the  backward  variable  is  issued  to  re-estimate  the  model  X  (Problem  3),  and  that 
the  number  of  computations  is  decreased  to  N^. 

Therefore,  using  the  definitions  of  the  forward  and  backward  variables, 
and  according  to  [6]: 

Pr{0\X)  =  '^a,(i)P,(i),  t  =  (2.22) 

i=l 
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Um-Ball  Example: 


Forward  and  backward  variables  are  evaluated  for  the  um-ball  example 
considered  earlier.  Recall  for  this  example: 


'0.5 

0.3 

0.2" 

"0.3 

0.5 

0.2' 

A  = 

0 

0.8 

0.2 

,  B  = 

0.4 

0.4 

0.2 

0 

0 

1 

0.6 

0.3 

0.1 

71  =  {0.9,  0.1,  0} , 


0  =  {\,  2,  2,3;. 


Thus,  using  eq  (2.16)  leads  to: 


a/ 1 ;  =  TtMOx  )  =  Q.9-b,(\)  =  0.9  •  0.3  =  0.27 


a,(2)  =  Tc^b^iO, )  =  0. 1  •  1 )  =  0. 1  ■  0.4  =  0.04 


a,(3)  =  Tc^b^(OJ  =  0-b^(\)  =  0. 

Next,  using  the  recursion  formula  (2.17)  leads  to  the  following  values  for 
the  forward  variable  (Xt(i)  for  t=l,  2, ,  T-1  and  1^^.  For  example, 

1=1 

=  (E«.(‘>„)i>.(2) 

1=1 

=  (ofj  (l)  *  J  CCi  (2)  •  €l2\  ^1  (3)  *  ^31  ) ' 

=  (0.27-0.5  +  0.04-0  +  0)- 0.5 
=  0.0675 
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a,(2)  =  (Oj 

/=! 

=  (Xa,(iK,)'ij(2) 

/=1 

=  (flfj  (l)  •  £Zj2  +  <3fj  (2)  ■  <222  ■^^1(3)'  0.yi )  ■  0-4 

=  (0.27-0.3 +  0.04 -0.8  +  0) -0.4 
=  0.0452 

^^2(3)  =  (E^iOK)-^3(<^2) 

i=l 

=  (i;a,(-K,)i,(2) 

=  (ofj  (1)  -  <2j3  +  Qfj  (2)  -  <223  ^1  (^) '  ^33  )  ’ 

=  (0.27 -0.2 +  0.04 -0.2  +  0) -0.3 
=  0.0186. 

All  values  of  the  forward  variable  are  shown  below  in  the  matrix  Afw: 

0.27  0.04  0 

0.06  0.0452  0.0186 

0.016875  0.022564  0.012342 

0.0016875  0.00462274  0.00202298 

which  is  of  dimension:  4x3  (TxN).  Similarly,  we  compute  the  backward  variable, 

according  to  eqs  (2.20)  &  (2.21)  starting  from  t=T.  Recall  from  eq  (2.20)  that: 

PJii)  =  l,  V/  =  1,2,3. 

Next,  using  the  recursion  formula  given  in  eq  (2.21)  to  compute  p3(i)  leads  to: 
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;=1 

7=1 

=  [«!!  •Z),(3)  +  ai2  •i>2(3)  +  <2i3 
=  (0.5  •  0.2 +  0.3  0.2  + 0.2 -O.l)-! 

=  0.18, 

All  values  for  the  backward  variable  PtO)  are  shown  below  in  the  matrix  Bbw  as: 


0.027582 

0.022152 

0.009 

0.0726 

0.0636 

0.03 

0.18 

0.18 

0.1 

1 

1 

1 

Note,  that  the  values  in  the  last  row  are  always  1,  according  to  eq  (2.20).  At  this  point, 
we  can  evaluate  Pr(0|A,),  using  the  forward  variable,  according  to  eq  (2.18),  which  leads 
to: 


Pr(0 1  A) = a^i  i)  =  a  A  i)  =  cc^  (1)  +  a^i2)  +  a^  (3)  +  (4) 

=  0.0016875  +  0.00462274  +  0.00202298 
=  0.00833322000000. 

2.  Problem  2:  Optimal  State  Estimation 

Problem  2  deals  with  finding  the  optimal  state  sequence  I  for  a  given  observation 
sequence  O.  One  possible  solution  to  this  problem  is  to  maximize  the  expected  number 
of  correct  individual  states  by  choosing  the  states  it  that  are  more  likely  to  occur  [3].  This 
computation  uses  the  variable  Yt(i),  defined  as  the  probability  of  being  in  state  qi  at  time  t, 
given  the  observation  O  and  the  model  X  [3]: 

(2.23) 
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Yt(i)  can  be  expressed  as: 


^(0A(0 

Pr(0|;i)  ’ 


(2.24) 


where  at(t),  Pt(t)  are  the  forward  and  backward  variables,  defined  earlier.  Replacing  eq 
(2.23)  into  eq  (2.14)  leads  to: 


Ea,(0A(0 


(2.25) 


Note  that  the  normalization  factor  Pr(0|A,)  in  the  denominator  of  eq  (2.24)  is  needed  to 
make  Yt(i)  a  conditional  probability,  which  leads  to  the  following  constrain  being 
satisfied: 


En(0  =  l-  (2.26) 

1^1 

Finally,  the  optimal  state  sequence  ii  can  be  obtained  by: 

i,  =argmax[y,(i)],  l<t<r.  (2.27) 

A  more  efficient  approach  to  compute  the  optimal  state  sequence  uses  a  decoding  based 
on  dynamic  programming  called  Viterbi  algorithm  and  shown  in  Table  2.4.  The  Viterbi 
algorithm  is  designed  to  find  the  best  path  (sequence)  which  maximizes  the  probability 
Pr(0,IlX)[3]. 
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Table  2.4  Viterbi  Algorithm 


As  an  application,  let’s  go  back  to  example  3  defined  earlier  in  section  3  to  find 
the  optimal  sequence  it,  given  the  model  X  and  the  observation  O.  Recall  that,  for  the 
given  model: 

'0.5  0.3  0.2]  ro.3  0.5  0.2“ 

A  =  {a,y}=  0  0.8  0.2,  B  =  {bjk]=  OA  0.4  0.2,  ;;r  =  {0.9,  0.1,  0},  O  =  {l,  2, 2, 3} 

0  0  1  J  [0.6  0.3  0.1_ 

Thus,  according  to  Table  2.4,  we  get: 
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3i(i)  =  7i:  ibi(Oi),  l<z<A^ 

(l)  =  7c,b^(0^  j  =  0.9  •  0.3  =  0.27 
(5,  (2)  =  7C2b2(0, ;  =  0.1  •  0.4  =  0.04 
S,(3)  =  !t,b,(OJ  =  0 

<Pi(i)  =  0 

U)  =  i}hj  K  {0,),l<j<N,  so: 

^2  (l)  =  (i)an  >i  {O2 )  =max[^i  {l^uA  i^)a2iA  (3)«3i  h 


Similarly,  after  computing  all  5  values,  we  define  the  matrix  A,  such  as: 


A  =  {<^,(7)}= 


0.27  0.04  0 

0.0675  0.0324  0.0162 

0.016875  0.010368  0.00486 

0.0016875  0.00165888  0.000486 


Then,  we  compute  the  (p  variable,  defined  in  Table  2.3  as: 

^,(;)  =  argmax[<J,_j(z)flJ. 

l<(<Af 

For  example,  92(1)  is  defined  as: 

^2(1)  =  argmax[(5,(zki]=  argmax[5i(l)aii,^i(2)a2i,<5i  (3)031] 

=  arg  max[0.27  •  0.5,0.04  •  0,0]  =  1. 

Similarly,  we  compute  all  other  (pt(j)  variable,  and  define  the  <I>  matrix: 


0 

1 

1 

1 


0  0 
1  1 
2  3 
2  3 


Next,  we  find  the  last  state  at  time  T=4  defined  as: 
z*  =argmax[<53.(z)], 

l^iSN 

which  leads  to: 
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I*  =  arg  max[<?4  (i)]  =  arg  Tnax[S^  (l),  <^3  (2),  (3)] 

\<i^N 

=  argmax[<J3(l),(J3(2),<53(3)]=argmax[0.0016875  0.0016588  0.000486] 

=  1. 

Finally,  the  other  state  sequence  are  defined  using  the  backtracking  method  of  the  Viterbi 
algorithm  given  in  Table  2.4,  which  leads  to: 

C  =^,+i(Ci)>  fort  =  3,2,1. 

=^4fc)=9>4W  =  l 
'2=«»3fe)=«’3W  =  l 

h  =«»2fe)=^2W  =  l- 

Therefore,  the  optimal  state  sequence  for  this  model  and  observation  sequence  is  given  by 
it={  1,1,1,1 },  which  shows  that  all  balls  come  from  the  first  urn.  Note  that  repeating  the 
same  experiment  with  the  observation  sequence,  0={1,2,3,3},  results  in  iF{l, 2,2,2}. 
Further,  note  that  we  always  expect  a  forward  moving  state  sequence,  since  our  model  is 
a  left-right  model.  In  addition,  we  can  also  use  the  variable  5t(j)  to  evaluate  Pr(0|A,),  and 
thus  we  introduce  the  Viterbi  probability  Ptv,  such  as: 

Pr  =  max{(5j.  (i)}.  (2.28) 

Using  eq  (2.28)  leads  to: 

Pr,  =  max{J4  (i)}=  max{<^4  (1) ,  (2) ,  (3)} 

=  max{0.0016875  0.00165888  0.000486} 

=  0.0016875, 

which  is  consistent  (meaning  in  the  same  range)  to  the  value  0.00833  found  earlier  using 
the  Baum-Welch  probability  eq  (2.18).  Note  that  we  can’t  expect  to  find  the  same  exact 
value  (since  theoretically  is  not  the  same),  but  we  can  use  both  probabilities  to  reconfirm 
the  decision. 
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3. 


Problem  3:  Re-estimation  of  Model  Parameters 


The  last  problem  deals  with  adjusting  the  model  parameters  (A,  B,  7t),  so  that 
probability  Pr(0|A,)  is  maximum.  To  that  end,  we  define  the  variable  ^t(ij)  as  the 
probability  of  a  path  being  in  state  qi  at  time  t  and  making  a  transition  to  state  qi  at  time 
t+1,  given  the  observation  sequence  and  the  model,  such  as  [3]: 


e  / .  (0,^1  )A+i  (7) 

Pr(Ol/l) 


(2.29) 


Recalling  the  definition  of  Yt(i)  given  earlier  in  eq  (2.25)  as  the  the  probability  of  being  in 
state  qi  at  time  t,  given  the  observation  O  and  the  model  X,  and  using  eqs  (2.23)  and 
(2.24),  we  can  relate  Yt(i)  to  4t(i),  by  summing  ^t(i)  over  all  states  j,  which  leads  to: 


(2.30) 

;=i 

Similarly,  summing  Yt(i)  and  ^t(i)  for  all  t’s,  leads  to  the  following  result[3]: 
r-j 

(z)  =  Expected  number  of  transitions  made  from  state  ,  (2.31) 

r=l 


and: 

r-i 

^  (if)  =  Expected  number  oftransitions  made  from  state  q^  to  state  qj .  (2.32) 

/=! 

Finally,  using  eqs  (2.31),  (2.32)  and  the  definitions  of  the  model  parameters,  we  can 
reestimate  the  model,  according  to  the  Baum-Welch  formulas: 


(2.34) 


r-i 


E  ) 


S7,(0 


r=l 


Sy,(0 

3.b,(k)='4^ -  .  (2.35) 

Er,(') 


We  then  continue  reestimating  our  model  (applying  the  new  model  to  the  variables  y  and 
4),  with  the  re-estimations  formulas  defined  above,  until  we  reach  convergence,  i.e.,  so 
that: 


''old  ■ 


Application:  Um-ball  example. 

Next,  we  apply  the  re-estimation  formulas  to  our  um-ball  model  described  earlier: 


Recall: 


"0.5 

0.3 

0.2‘ 

'0.3 

0.5 

0.2' 

A  =  {aij]= 

0 

0.8 

0.2 

,B  =  {bjk]= 

0.4 

0.4 

0.2 

0 

0 

1 

0.6 

0.3 

0.1 

n  =  {7t}=[0.9,  0.1,  0}. 


For  the  observation:  O  =  {1,2,  2,3},  and  a  single  iteration  we  get: 


‘0.6256 

0.2945 

0.0079' 

‘0.4362 

0.4649 

0.0988' 

A  =  {ai]]= 

0 

0.8984 

0.1015 

,  B  =  {bjk]= 

0.0712 

0.5573 

0.3714 

0 

0 

1 

0 

0.4697 

0.5302_ 

n  =  {;5}=  {0.8936,  0.1063  0}. 
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Note  that,  1)  the  summation  all  rows  of  the  matrixes  A,  B  and  vector  n  remains  equal  to 
1,  and  2)  A  is  still  an  upper  triangular  matrix,  since  our  model  is  let-right.  Re-estimating 
the  model  for  10  iterations  leads  to: 


0 

1 

0  1 

'1 

0 

0 

A  = 

0 

0.3341 

0.6658 

,  B  = 

0 

1 

0 

0 

0 

1 

0 

0.3325 

0.6674 

n  =  {l,0,0}. 

As  we  can  see  from  the  above  results,  the  matrix  A  still  remains  an  upper 
triangular,  which  means  that  the  model  is  still  a  left-right  model.  Examining  the  B  matrix, 
we  conclude  that,  if  we  are  in  the  first  state,  we  will  observe  only  the  1®'  symbol  (1**  row, 
1*'  column  =1),  etc.  Finally,  the  n  matrix,  shows  that  procedure  starts  from  the  first  state. 

F.  SCALING 

Numerical  implementations  of  the  forward-backward,  Baum-Welch  or  Viterbi 
algorithm  may  lead  to  underflow  problems  due  to  the  small  numbers  involved  in  the 
required  computations.  In  addition,  the  problem  may  become  worse,  as  the  matrix 
dimensions  involved  in  the  computations  increase.  As  a  result,  scaling  is  required  to 
avoid  such  mathematical  problems  [4].  The  basic  idea  behind  the  scaling  procedure  is  to 
multiply  the  forward  and  backward  variables  OtO)  and  3t(i)  by  a  coefficient  so  that  the 
scaled  a,  (/)  and  (i)  are  kept  within  the  dynamic  range  of  the  computer. 

Note  that  we  can  rewrite  the  reestimation  formula  eq  (2.34),  in  terms  of  the  the 
forward  and  backward  variables,  as  [4]: 

- .  (2.36) 

EE";  (0,^1  )A+1  (J) 

1=1  M 
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Consider  the  forward  algorithm  used  to  compute  of  the  forward  variable  ott(i),  discussed 
earlier.  at(i)  is  multiplied  by  the  “scaling”  factor  Ct  defined  as: 


c,  = 


<  N 


i=l 


leading  to  the  scaled  forward  variable  defined  as: 

4  (0  =  c, «,(;)• 

The  scaled  coefficient  a,  (z)  can  be  shown  to  be  equal  to  [4,  pg.  272] 


(2.37) 


(2.38) 


=  - . 

1=1  M 

We  also  define  a  modified  forward  variable,  as: 

;=i 

By  induction  ^_i(7)  can  be  written  in  terms  of  as: 


Vr=i  ; 


Thus: 


/-I  ^ 


a,(i)  = 


1=1  ;=1 


T=1 


1=1 


The  scaled  backward  variable  (z)  is  defined  as: 


(2.39) 


(2.40) 


(2.41) 


(2.42) 


PM)  =  c,PM)- 


(2.43) 
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At  this  point,  we  can  rewrite  the  reestimation  formula  given  in  eq  (2.36),  using  the  scaled 
forward  variable  such  as: 


T-\ 


a  ij  =  — — — 
“y  T 


(2.44) 


SE^<0‘K^(^^r+i)A+i(7) 

f=l  ;=1 

Similarly  the  re-estimation  formula  for  bj(k)  eq  (2.35)  becomes: 


r-i 


l&xmu) 


t=l 


(2.45) 


f=i 


Finally,  it  can  be  proved  that  the  probability  Pr(0|A,)  and  the  Viterbi  probability  Pry  can 
be  computed  using  the  scaling  factor  [4], which  leads  to: 

Pr(0|;i)  =  ^,  (2.46) 

/=i 

or, 

log[Pr(0|/l)]=-2logc,  (2.47) 

/=1 

log(PrJ  =  m^[(Zij.(z)].  (2.48) 

G.  MULTIPLE  OBSERVATION  SEQUENCES 

In  real  word  applications  we  need  to  train  the  HMM  using  multiple  trials,  to 
ensure  robustness  in  the  recognition/classification  process.  Each  training  trial  signal 
produces  an  observation  sequence.  Assume  that  there  are  K  trials,  i.e.,  K  observation 

sequences.  We  denote  the  set  of  observation  sequences  O  such  as: 
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where; 


(2.49) 


(2.50) 

is  the  k*  observation  sequence.  Provided  that  each  observation  sequence  is  independent 
of  each  other  and  identically  distributed,  we  want  to  find  the  model  which  maximizes  the 
probability; 

Pr(OlA)  =  nPr(o(*)|;i)  =  nP*  .  (2.51) 

*:=!  k=\ 


Therefore,  the  multiple  observation  reestimation  formulas  using  the  scaled  variables  are; 


a.j  = 


ik^l 


(2.52) 


br.  = 


*oT‘)=v, 


n-1 


*=1  “k  (=1 


(2.53) 


Note  that  we  don’t  reestimate  rq,  so  we  keep  it  unchanged  through  the  iterations. 
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III.  ISOLATED  WORD  RECOGNITION 


This  section  presents  the  application  of  HMMs  to  speech  recognition. 
Specifically,  we  show  how  the  model  parameters  have  to  be  selected  and  adjusted  using  a 
limited  4-word  recognition  example.  Our  goal  is  to  show  how  a  more  generic  classifier 
can  be  set-up  through  the  speech  recognition  example.  Li  addition,  we  point  out  the 
potential  difficulties  one  may  encounter  while  setting  up  and  implementing  the  software 
for  such  an  application,  due  to  computer  numerical  precision  limitations. 

A.  GENERAL  HMM  TRAINING  AND  TEST  PROCEDURE 

This  section  describes  the  application  of  HMMs  to  classification  in  the  context  of 
speech  recognition.  The  overall  procedure  is  illustrated  in  Figure  3.1.  First,  labeled  data  is 
used  to  train  the  model.  These  specific  training  signals  may  be  multiple  trials  of  the  same 
type  of  signal,  i.e.,  belong  to  the  same  class,  or  belong  to  different  classes.  For  example, 
multiple  trials  of  the  same  word  may  belong  to  the  same  signal  class  in  speech 
recognition  applications.  In  such  a  case,  all  words  used  in  the  recognition  set-up 
constitute  the  dictionary.  Next,  information  uniquely  characterizing  each  class  needs  to 
be  extracted  from  the  signal  classes.  Thus,  each  signal  is  split  into  T  segments,  and  some 
useful  features  extracted  from  each.  Feature  vectors  may  include  LCP  or  cepstral 
coefficients,  energy,  etc. . .  Thus,  the  initial  signals  are  converted  into  a  set  of  continued¬ 
valued  vectors.  Next,  this  set  gets  converted  into  a  sequence  of  discrete  vectors  using 
vector  quantization  (VQ)  [4,5],  which  will  be  described  further  in  Section  2.  The  set  of 
M  discrete  symbols  forms  the  codebook.  For  example,  assume  we  have  a  speech  signal 
divided  into  4  segments  (i.e.,  T=4),  and  that  the  set  of  symbols  representing  one  signal 
segment  is  a  number  ranging  from  1  to  8.  Thus,  a  possible  observation  sequence  of  the 
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k*  signal  may  be  given  as  0*^  =  {7,  3,  4,  8}.  At  this  point,  we  can  check  whether  the 
features  extraction  method  and  dimension  of  the  codebook  M  makes  sense  by  comparing 
the  observation  sequences  obtained  for  the  signals  of  the  same  class,  as  it  is  reasonable  to 
expect  some  similarity  between  the  resulting  sequences. 

Recall  that  the  set  of  observations  derived  from  each  class  of  training  signals  is 
the  only  information  used  to  train  a  class-specific  model  A,{A,  B,  7t}.  First,  an  initial 
estimate  of  the  model  is  required  to  apply  the  Baum-Welch  algorithm,  as  described 
earlier  in  Section  n.  Initial  values  may  be  set  randomly  so  that  they  satisfy  the  model 
constraints  and  converge  to  the  correct  type  of  model,  i.e.,  the  initial  matrix  A  is  to  be 
selected  as  upper  triangular  for  left-to-right  models  so  that  it  converges  to  an  upper 
triangular  matrix.  The  Baum-Welch  algorithm  iteratively  estimates  the  model  and  stops 
when  there  is  no  significant  model  parameter  changes  between  two  successive  iterations. 
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Training 

Signals 


Initial  Model  Conditions 
(initial  model  Xi) 


Figure  3.1  HMM  General  Training  and  Testing  Procedure 
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Unlabeled  signals  are  used  during  the  testing  phase  to  identify  which  class  they  belong  to. 
First,  testing  data  are  pre-processed  to  generate  the  codebook  vectors  and  corresponding 
observation  sequences,  following  the  same  process  as  that  used  during  the  training  phase. 
Next,  the  set  of  the  probabilities  observing  the  tested  signal  O,  given  the  k*  model, 
PrCOlX^*^^)  is  computed  using  either  the  Baum-Welsh  or  the  Viterbi  algorithm  for  all  k* 
models,  as  discussed  earlier  in  Section  EL.  Finally,  the  model  type  is  selected  by  choosing 
that  with  the  highest  probability,  PrCOlX"***) . 

Next,  we  describe  a  simple  4-word  recognizer  designed  to  recognize  the  words: 
Statistics,  Microsoft,  Instructor,  and  Professor.  Three  trial  words  are  used  for  training  and 
one  word  is  use  for  testing  for  each  class. 

1.  Data  Creation  and  Preparation 

All  data  were  recorded  on  the  same  machine  running  Windows-98  Sound 
Recorder  with  a  sampling  rate  of  8000Hz  sampling  and  8bit  mono  encoding.  One  male 
speaker  was  used.  However,  note  that  there  are  some  variations  between  the  trials  so  that 
no  word  is  pronounced  twice  exactly  the  same  way.  An  energy  detector  was  applied  to 
remove  silence  before  and  after  each  word,  resulting  in  word  of  about  9000  points.  Next, 
each  signal  was  interpolated  to  10000  points  to  obtain  trials  with  the  same  length.  Finally, 
each  data  was  sent  through  a  pre-emphasis  filter  with  transfer  function  H(z)=l-ctz'*  with 
0=0.98,  to  emphasize  the  relative  energy  of  the  high-frequency  spectrum  which  contain 
useful  information.  The  MATLAB  software  implementation  is  presented  Appendix  A.3. 

2.  Feature  Extraction 

We  varied  the  length  and  number  of  segments  (time  frame),  and  the  specific  type  of 
feature  parameters  until  we  obtained  a  combination  which  lead  to  correct  classirication. 
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The  final  set  of  parameters  was  derived  by  each  word  into  T=7  segments  using  a 
rectangular  window,  with  an  overlap  of  10%,  where  the  time  length  of  each  trial  was 
about  Isec.  The  following  eight  parameters  were  extracted  for  later  use  in  VQ  from  each 
segment:  LPC  coefficients  from  a  7*  order  filter  derived  with  the  covariance  method  [1], 
and  the  energy  of  the  section.  Finally,  we  normalize  the  value  of  the  energy  dividing  all 
values  by  the  max  energy. 

3.  Vector  Quantization 

Vector  quantization  (VQ)  is  a  scheme,  which  maps  a  sequence  of  continuous¬ 
valued  vectors  into  a  sequence  with  a  given  number  of  discrete  vectors,  called  the 
codebopk  [7].  Therefore  VQ  can  be  viewed  as  some  type  of  encoding  scheme,  where  the 
encoder  y  assigns  a  channel  symbol  y(x)  from  an  ensemble  of  M  symbols  to  each  input 
vector  x={xo,  xi,...,  Xn}  [7].  Note  that  there  is  no  need  to  define  a  decoder,  as  the 
discrete  sets  of  parameters  never  get  translated  back  into  the  original  vector. 

Basically,  VQ  partitions  the  set  of  coefficients  into  M  disjoint  sets.  Each  set  is 
represented  by  a  single  vector  {Vni}l^m<M,  which  is  a  centroid  of  the  vectors  in  the 
coefficient  set  assigned  to  the  m“*  region  [4]. 

Note  that  there  is  a  distortion  penalty  associated  with  VQ,  as  all  feature  vectors 
are  represented  using  a  set  of  M  codebook  vectors.  The  larger  the  dimension  of  the 
codebook,  the  smaller  is  the  overall  distortion  between  original  feature  vectors  x  and 
codebook  vectors  Xr.  The  distortion  measure  d(x,Xr)  resulting  from  the  codebook  selection 
process  can  be  represented  using  the  Euclidean  norm  as  [7]: 

d{x,x^)  =  \\x-x£ .  (3.1) 
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For  our  purpose  two  schemes  were  applied  to  derive  the  codebook  vectors:  1)  a 
competitive  Neural  Networks  implementation,  and  2)  a  K-means  (or  LGB  algorithm) 
algorithm  [7,  8]. 

a)  Competitive  Neural  Network  Implementation 
Input  Competitive  Layer 


R  S 


Figure  3.2  Competitive  Neural  Network 

A  competitive  unsupervised  neural  network  (NN)  implementation  was 
selected  to  compute  the  codebook,  as  shown  in  Figure  3.2  [2].  Basically,  this  NN  can  be 
viewed  as  a  clustering  scheme,  where  weights  associated  to  each  neuron  are  used  to 
assign  a  symbol  M  to  each  word  segment.  The  software  implementation  is  presented  in 
Appendix  A.I. 
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Vector4,  Class=3  Vectors,  Classes  Vectors,  Class=3 


Vector?,  Gasses  Euclidean  Distance  original/quantized  vectors 


Figure  3.3  Diagram  of  coefficient  vectors  and  VQ  vectors,  Euclidean  distance  for  the 
word  “statistics”,  using  a  competitive  neural  network 

Figure  3.3  shows  seven  original  feature  vectors,  and  the  corresponding 
quantized  vectors  obtained  with  the  competitive  NN.  Training  took  about  2000  epochs 
and  50mn  with  a  Pentium-in  450MHz.  Each  initial  feature  vector  is  mapped  into  one  of 
the  M=4  quantized  vectors.  For  example,  vectors  2  and  3  get  mapped  to  the  same  fourth 
class  due  to  their  consistency.  Similarly,  vectors  5,  6,  and  7  get  mapped  to  the  3*^  class  of 
the  quantized  vectors. 

b)  K-means  Scheme 

The  second  method  considered  is  the  K-means  algorithm,  also  called  the 
Lloyd  or  Linde-Buzo-Gray  (LBG)  algorithm  [9].  The  software  implementation  is  given 
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in  Appendices  A.13  and  A.14  [8].  Basically,  the  K-means  algorithm  is  a  clustering 
scheme,  which  iteratively  finds  a  set  of  k,  quantized  vectors  into  which  all  training 
vectors  get  mapped  to  with  minimum  distortion.  The  number  of  clusters  increases 
iteratively  by  splitting  the  existing  quantized  vectors  obtained  at  each  iteration,  until  a 
desired  number  of  quantized  vectors  or  distortion  levels  is  obtained  [9].  Thus,  the  size  of 
the  codebook  is  a  power  of  two,  i.e.,  2,  4,  8,  16,  etc...  The  LGB  algorithm  is  illustrated 
for  the  word  “Microsoft”  in  Figures  3.3  and  3.4  for  codebook  sizes  equal  to  4  and  32 
respectively. 
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1.5 


Vectorl,  Class=3 


Vector2,  Class-1 


Vectors,  Class=2 


Figure  3.4  Diagram  of  coefficient  vectors  and  VQ  vectors,  Euclidean  Distance  for  word 

“Microsoft,”  using  the  K-means  (LGB)  algorithm  and  M=32. 

Note  the  distortions  are  much  smaller  when  M=32  than  when  M=4.  This 
is  to  be  expected  as  a  larger  codebook  size  allows  more  flexibility.  However,  a  large 
codebook  may  not  be  always  desirable  as  it  may  lead  problems  in  evaluating  and 
comparing  the  resulting  probabilities  Pr(OlX,),  as  we  will  see  later.  The  K-means 
implementation  is  a  little  faster  than  the  NN  technique,  however  it  is  restricted  to 
codebook  sizes  which  are  powers  of  two.  No  such  restriction  is  needed  for  the  NN 
implementation. 

Note  that  an  initial  selection  of  the  number  of  segments  T  and  the  size  of 
the  codebook  M  may  be  made  by  observing  a  small  selection  of  the  resulting  quantized 
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vectors.  The  basic  idea  is  to  select  a  combination  of  parameters,  which  lead  to  some  type 
of  consistency  in  the  resulting  quantized  vectors  in  a  given  class. 

4.  HMM  Training 

This  section  describes  the  application  of  the  HMM  training  theory  presented  in 
Section  n  to  the  speech  recognition  example.  The  HMM  implementation  uses  MATLAB 
5.3,  and  is  presented  in  Appendix  A.  MATLAB  may  not  be  the  most  desirable  software 
language  as  it  is  relatively  slow,  however  it  was  selected  for  ease  of  implementation  to 
test  the  concepts. 

a)  HMM  Initial  Conditions 

Usually,  a  left-to-right  model  is  preferred  over  the  more  general  ergodic 
one  in  speech  applications  because  model  states  can  be  associated  with  time  in  a 
straightforward  manner  [4].  Thus,  we  selected  a  left-to-right  model,  where  the  matrix  A 
is  upper  triangular.  In  addition,  we  also  prefer  the  model  to  start  from  an  early  state  so 
that  it  can  pass  from  all  possible  states.  As  a  result,  we  didn’t  use  random  values  for  the 
vector  It,  but  instead  we  initialized  the  first  coordinate  to  be  quite  large,  the  second  one 
smaller,  and  so  on,  and  applied  the  constrain  that  the  summation  of  all  values  in  n  is 
equal  to  one.  We  use  random  values  satisfying  the  appropriate  constraint  for  the  matrix 
B. 


b)  Number  of  States  N 

HMM  states  are  called  “hidden”  because  all  information  about  the  state 
sequence  is  not  accessible  directly,  since  the  only  information  available  is  given  by  the 
observations.  According  to  Rabiner  [4],  there  are  two  schools  of  thought  for  the  physical 
meaning  of  the  number  of  states;  the  first  one  states  that  it  represents  the  number  of 
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sounds  (i.e.,  phonemes)  for  each  word,  which  is  usually  a  number  between  2  and  10.  The 
second  interpretation  is  that  it  represents  the  average  number  of  observations  in  a  spoken 
version  of  the  word.  Basically,  we  can  assume  that  the  number  of  states  represents  the 
number  of  distinct  sounds  (e.g.,  phonemes  or  syllables)  of  the  word.  In  our  case  a 
number  of  states  N  equal  to  4  was  deemed  appropriate,  since  our  vocabulary  contains 
only  4  words. 

Further  investigations  may  be  required  when  applying  HMMs  to  other 
types  of  signals  if  we  cannot  relate  the  number  of  states  to  some  physical  meaning  behind 
the  data.  At  worse,  the  number  of  states  can  be  selected  by  trial  and  error  at  that  leading 
to  the  highest  recognition  results.  Note  that  selecting  too  small  a  number  of  states  may 
prevent  from  differentiating  between  classes.  Simulations  showed  that  selecting  too  large 
a  number  of  states  may  result  in  overly  long  training  time  and  numerical  instability  in  the 
implementation  which  cannot  be  controlled  with  scaling. 

c)  HMM  Re-estimation 

The  HMM  re-estimation  iterative  scheme  implemented  uses  the  multiple 
observation  sequence  technique  described  earlier  in  Section  2.  The  MATLAB  software 
implementation  is  given  in  Appendix  A.7  to  A.9.  First,  we  re-estimate  the  model 
parameters  for  every  different  trial,  and  then  we  apply  these  results  in  the  re-estimation 
formula  in  eqs  (2.52)  and  (2.53). 

The  HMM  re-estimation  step  uses  the  scaled  forward  and  backward 
variables  to  avoid  numerical  instabilities,  and  its  implementation  is  given  in  Appendix 
A.6.  In  addition,  we  ran  all  MATLAB  files  using  a  scaled  fixed  point  format  with  15 
digits  (format  long).  Since,  some  computations  still  resulted  in  “divided  by  zero”  errors 
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after  applying  these  corrective  measures.  We  forced  the  denominator  quantities  to  be 
equal  to  the  smallest  floating  point  number  whenever  there  were  found  to  be  smaller. 

We  used  multiple  iterations  for  each  single  observation  HMM  re¬ 
estimation  step  until  convergence  was  reached.  We  noticed  that  the  model  doesn’t 
necessarily  converge  to  the  same  parameters  for  the  same  observation,  when  repeating 
the  re-estimation  procedure  with  random  initial  conditions.  However,  correct  decision  is 
still  achieved. 

d)  Scoring 

The  system  is  ready  to  test  any  word  and  categorize  it  as  one  the  four  class 
types  after  the  four  models  k={  1,2,3 ,4}  derived  separately  for  each  word  (Microsoft, 
Statistics,  Instructor  and  Professor)  are  identified.  We  repeat  the  same  feature  extraction 
scheme  for  the  testing  words,  using  either  the  neural  network  or  the  codebook  derived  in 
the  K-means  algorithm  during  the  training  process.  Let’s  assume  that  the  observation  of 
the  testing  word  is  O.  During  testing,  we  evaluated  the  probability  Prbw(OlA,^^)  for 
k=l,...4,  using  either  the  Baum-Welch  or  the  Viterbi  algorithm,  and  selected  the  model 
which  gives  the  highest  probability  Pr(OlX,^^)  (Appendix  A.  10,  A.  1 1). 
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Classification  of  testing  words  using  all  models  with  parameters: 


model:  M=4.  N=8,  T=7 


-2000  -1600  -1000  -500  0  -3000  -2500  -2000  -1500  -1000  -500  0 


‘1200  -1000  -800  -600  -400  -200  0  -250  -200  -150  -100  -50  0 


PewtdBl 

Tested  words:  1:  Microsoft,  2:  Statistics,  3:  Instructor,  4:  Professor 
Class  models:  Microsoft,  X^^^:  Statistics,  X^^:  Instructor,  X^'^^iProfessor 

PBw[dB]:  101ogioPrbw(0|X) 

PviterbifdB]:  101ogioPrv(0|X) 

:  Correct  Decision  (100%  success) 

Figure  3.5  Scoring  of  all  4  testing  words  (1:  Microsoft,  2:  Statistics,  3:  Instructor,  4: 
Professor)  given  each  model  Parameters;  M=4,  N=2,  T=7.  This  system 
performed  100%  successful  decisions. 

Figure  3.5  presents  the  results  obtained  when  the  number  of  segments  T  is 
equal  to  4,  the  number  of  symbols  M  is  equal  to  4  ( computed  with  the  LGB  algorithm), 
and  the  number  of  states  N  is  equal  to  8.  The  four  horizontal  bars  contained  in  each  word 
represent  the  probabilities  101ogio(P(0|X^*^^)),  k=l,...4.  Thus,  the  highest  probability  is 
that  closer  to  the  OdB  point,  which  represents  the  point  of  probability  1.  The  left  column 
plots  represent  the  results  obtained  with  the  Baum-Welsh  algorithm  while  the  right  colum 
plots  represent  those  obtained  with  the  Viterbi  algorithm.  Recall  that  the  four  models 
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andX(^>  were  trained  with  three  trials  for  every  class.  The  software 
implementation  for  this  testing  step  is  given  in  Appendix  A.  11.  Results  show  that  correct 
classification  is  obtained  in  all  32  cases,  as  shown  in  Figure  3.5. 


Classification  of  all  testing  words  using  all  models  with 
model:  parameters:  M=4,  N=2  (inappropriate),  T=7 


-100  -80  -60  -40  -20  0  .  -100  -80  -60  -40  -20  0 


-1200  -1000  -800  -600  -400  -200  0  -200  -ISO  -100  -50  0 
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B  1 

_ -J _ 1 _ 1 - 1 

-1200  -1000  -800  -600  -400  -200  0  -200  -150  -100  -50  0 


Tested  words:  1:  Microsoft,  2:  Statistics,  3:  Instructor,  4:  Professor 
Class  models:  Microsoft,  X®:  Statistics,  X®:  Instructor,  X^^^:Professor 

PewldB]:  101og,oPrbw(0|X) 

PviteibildB]:  101ogioPrv(0|X) 


:  Correct  Decision 


:  Wrong  Decision 


Figure  3.6  Scoring  obtained  for  all  four  testing  words  (1:  Microsoft,  2:  Statistics,  3: 
Instructor,  4:  Professor)  given  each  model  Model  Parameters  selected:  M=4, 
N=2,  T=7.  This  HMM  has  too  few  states  (N=2)  and  cannot  classify  the  word 
‘instructor’  correctly. 
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Figure  3.6  presents  the  results  obtained  when  selecting  too  small  a  number 
of  states  (N=2).  All  other  parameters  are  kept  the  same  as  those  in  Figure  3.6.  Note  that 
the  algorithm  reaches  the  wrong  decision  for  the  word  “instmctor”  which  was  found  to  be 
“professor.” 

B.  CONCLUSIONS 

This  section  presented  an  HMM-based  classifier  applied  to  a  simple  4-word 
recognition  problem.  We  implemented  and  tested  the  classifier  using  MATLAB,  and 
described  how  we  selected  the  various  parameters  which  need  to  be  selected  to  set-up  the 
classifier.  Next  Chapter  considers  the  application  of  the  HMM-based  classifier  to 
seismo-acoustic  signals  to  differentiate  between  two  types  of  mine-like  objects  buried  in 
shallow  sand. 
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IV.  HMM-BASED  CLASSIFICATION  OF  SEISMO-ACOUSTIC  MINE 

SIGNALS 

HMMs  can  be  applied  to  various  types  of  classification  problems.  This  Section 
presents  the  results  obtained  for  the  classification  of  mine-like  objects  buried  in  sand.  The 
mine  data  was  obtained  from  Prof.  Muir  who  heads  a  buried  mine  detection  project 
started  in  November  1996  at  the  Naval  Postgraduate  School.  Initial  results  obtained  are 
described  in  Gagham  [10],  Fitzpatrick  [11],  and  Hall  [12]. 

The  NPS  mine  project  is  a  continuation  of  work  started  at  the  Applied  Research 
Laboratory  of  the  University  of  Texas  at  Austin,  and  is  sponsored  by  the  Office  of  Naval 
Research.  The  main  goal  of  the  project  is  to  study  the  development  of  a  seismo-acoustic 
sonar  for  the  detection  of  buried  ordnance  using  guided,  seismic  interface  waves.  Earlier 
ARL  and  NPS  results  showed  that  seismic  interface  (Rayleigh)  waves  can  be  used  to 
detect  mine-like  objects  buried  in  sand  [10-12].  The  goal  of  the  current  NPS  program  is 
to  develop  an  improved  seismic  source  to  evaluate  the  feasibility  of  using  a  seismo- 
acoustic  sonar  to  detect  buried  ordnance  in  the  beach  and  surf  zones.  The  seismic  waves 
were  generated  by  two  actuators,  and  were  measured  by  two  three-axis  sensors- 
geophones.  Buried,  mine-like  objects,  ranging  from  71kg  to  290kg,  and  at  ranges  of  up 
to  5  meters  were  echo-located  by  applying  a  basic  polarization  filtering  signal  processing 
scheme. 

This  section  applies  the  HMM  concepts  derived  earlier  to  distinguish  between  two 
types  of  mine-like  objects.  Two  types  of  experiments  were  conducted:  1)  classification  of 
two  different  mine-like  objects  at  the  same  range  with  multiple  weights  for  each  mine 
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type;  and  2)  classification  of  two  different  mine-like  objects  at  3  different  ranges  with  the 
same  weight  for  each  mine  type. 

A.  BASIC  EXPERIMENT  INFORMATION 

This  section  does  not  present  any  theoretical,  physical  or  experimental  details  for 
this  project,  as  they  may  be  found  in  in  [10-12].  However,  we  do  provide  some  basic 
information  regarding  the  physical  nature  of  the  signals  under  study. 

The  beach  site  used  for  collecting  the  data  is  a  stretch  of  U.S.  Navy-owned  beach 
directly  seaward  of  NFS,  Monterey,  CA,  and  shown  in  Appendix  B.2.  This  area 
measured  roughly  150  feet  in  length  running  parallel  to  the  waterline  and  varied  from  20 
to  50  feet  from  the  high-to-low  water  mark.  Generally,  the  sand  conditions  varied  due  to 
the  different  waterline  distance,  resulting  in  changes  on  some  of  the  sand  characteristics, 
such  as  density  and  moisture  in  multiple  depth  layers. 

The  equipment  configuration  for  the  mine  detection  is  illustrated  in  Appendix 

B. 6.  Basically,  two  actuators  produce  the  Rayleigh  waves  described  in  [11].  Rayleigh 
waves  have  distinctive  features  that  make  them  identifiable  in  a  complex  seismo-acoustic 
wavefield.  The  most  important  property  of  Rayleigh  waves  is  the  elliptical  particle 
motion  produced  by  their  passage.  Their  unique  characteristic  is  the  90°  phase  shift 
between  their  horizontal  and  vertical  components,  which  results  in  elliptical  motion,  as 
illustrated  in  Appendix  B.7.  Thus,  the  placement  and  operation  of  the  actuators,  as 
illustrated  in  Figure  4.1,  can  be  categorized  into  two  different  configurations:  1)  actuators 
operated  horizontally;  and  2)  actuators  operated  vertically,  with  a  90°  relative  phase 
difference  between  their  drive  signals  [11]. 
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Figure  4.1  Actuator  placement  for  Rayleigh  waves  generation  [11]. 


The  generated  seismo-acoustic  wave  travels  through  the  geophones  and  meets  the 
target,  as  illustrated  in  Appendices  B.6  and  B.8.  Next,  the  reflected  wave  passes  through 
the  geophones  again,  and  the  received  signal  is  used  to  detect  the  target  presence.  The 
signal  is  collected,  as  described  in  [12],  and  the  signals  for  the  relative  x,  y,  and  z-motion 
are  shown  in  Appendix  B.5. 

Note  that  the  target  reflection  is  not  visible  from  the  raw  signals.  Thus,  vector 
polarization  (VP)  [12]  was  used  to  extract  the  Rayleigh  waves  from  the  raw  information. 
The  VP  step  takes  advantage  of  the  90°  phase  shift  between  vertical  and  radial 
components  to  pull  the  target  information  out.  This  is  accomplished  by  applying  the 
Hilbert  Transform  to  the  radial-vertical  signal  pair  [11],  the  complex  crossed-power  is 
computed  such  as  [12]: 

Prv=fHabert*X'’hilber,  (4-1) 

the  imaginary  component  of  Prv  is  essentially  proportional  to  the  intensity  of  the  seismic 
wave  due  to  the  90°  phase  shift  between  vertical  and  radial  components  [12].  The 
polarity  of  the  imaginary  power  is  associated  with  the  rotation  of  the  elliptical  motion  of 
the  wave  (e.g.,  a  negative  value  corresponds  to  an  anticlock-wise  motion).  Real  and 
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imaginary  components  of  the  cross-power  Prv  are  illustrated  in  Appendix  B.9.  Appendix 
B.8  shows  the  incident  wave  passing  the  geophone  at  lOft.  The  reflected  target  signal 
strength  is  quite  small,  and  we  need  to  expand  the  scale  to  see  it,  as  explained  further 
later. 

Two  mine-like  objects  were  used;  1)  a  cylinder  weighing  1501b  (68kg),  5ft  long, 
8in  (20cm)  in  diameter,  with  a  ‘/4-in  (0.6cm)  wall  thickness;  and  2)  a  U.S.  Navy  power 
can  or  “power  keg”  with  the  shape  of  a  cylindrical  sheet  metal  can  18in  (46cm)  high  and 
24in  (61cm)  in  diameter,  weighing  161b  (7kg).  These  objects  are  shown  in  Appendix 
B.3.  Additional  weights  were  used  to  vary  the  objects  total  weight.  These  additional 
weights  made  the  maximum  weight  of  the  cylinder  and  the  keg  equal  to  6181b  (280kg), 
and  6401b  (290kg)  respectively.  The  cylinder  was  always  buried  on  its  side  with  the 
cylindrical  axis  horizontal  and  in  the  direction  of  the  wave  propagation.  The  powder  keg 
was  always  buried  upright,  with  the  cylindrical  axis  vertical  to  the  direction  of  the  wave 
propagation.  Further  details  are  given  in  Appendix  B.4. 

B.  SIGNAL  SELECTION 

Recall  that  the  target  signal  is  visible  after  processing  the  raw  information  using 
VP.  Thus,  we  use  the  signals  obtained  from  the  imaginary  portion  of  the  cross  power  for 
our  application  only.  We  also  focus  on  the  experiments  conducted  on  the  6  and  10  of 
November  1998,  were  the  cylinder  and  the  keg  targets  were  used,  as  shown  in  Appendix 
B.6.  Two  parameters  were  varied  during  these  two  days  in  addition  to  the  sea  conditions: 
target  weights  and  distances  from  the  geophones.  Ten  trials  were  conducted  for  each 
experiment.  The  distances  between  the  different  pieces  of  the  equipment  (actuators, 
geophones,  target)  were  set  as  shown  in  Appendix  B.6.  As  a  result,  the  actuator- 
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geophone  distance  was  set  at  10ft,  the  actuator-target  distance:  16ft,  and  the  actuator- 
target-geophone  (reflected)  was  22ft  for  both  targets  (cylinder  and  keg).  Appendix  B.8 
shows  the  average  imaginary  cross  power  obtained  for  all  trials  associated  with  a  specific 
target  and  weight,  scaled  near  the  distance  of  22ft.  Appendix  B.9  shows  that  the  reflected 
target  signal  becomes  stronger  as  the  target  mass  increases.  Appendices  B.IO  and  B.  11 
plot  the  imaginary  cross-power  signals  obtained  from  the  cylinder  and  the  keg  targets 
respectively.  We  used  the  signal  from  the  2“^  geophone  for  the  cylinder  case,  as  it  is  the 
strongest  of  the  two,  and  the  signal  from  the  1*‘  geophone  for  the  powder  keg,  as  the  2”^ 
geophone  doesn’t  show  any  received  signal.  However,  the  signal  measured  at  the  2“^ 
geophone  for  the  keg  experiment  is  used  as  an  example  of  background  noise  (no  target) 
signal.  This  background  noise  is  used  as  part  of  the  testing  data,  as  described  later. 

C.  SIGNAL  PREPARATION 

Plots  contained  in  Appendix  B.IO  and  B.ll  show  there  is  some  consistency 
between  the  signals  obtained  from  multiple  weights  of  the  same  target.  Thus,  we  elected 
to  consider  all  signals  associated  with  a  given  target  to  the  same  class,  independent  of  the 
weight.  Five  trials  were  used  for  training,  and  the  6*  trial  was  used  for  testing.  The 
following  weights  were  used  for  the  keg  target:  1)  2241b;  2)  4321b;  and  3)  5361b. 
Similarly,  the  following  weights  were  used  for  the  cylinder  target:  1)  3641b;  2)  4681b  3) 
5721b.  Next,  we  assumed  that  we  detected  the  presence  of  a  non-background  signal 
using  some  type  of  threshold  detector,  and  truncated  the  signals  used  for  the  HMM  set-up 
around  the  position  of  the  target  location.  Signals  used  for  the  HMMs  training  and 
classification  are  shown  in  Figure  4.2.  Note  that  it  would  be  useless  to  include  the 
incident  signal  received  at  a  geophone  from  the  actuators,  as  it  doesn’t  contain  any 
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information  about  the  target,  and  is  much  stronger  than  the  signal  reflected  from  the 
target.  Each  signal  is  400  data  points  long.  Two  classes  (keg  and  cylinder)  are 
generated.  At  this  point,  the  goal  is  to  recognize  the  shape  of  the  target,  and  each  class 
needs  to  be  characterized  by  a  set  of  features,  as  explained  next. 


Powder  Keg  2241b 


Powder  Keg  4321b 


Cylinder  364b 


Cy6nder468b 


Cylinder  572b 


0  100  2CX)  300  400 

Data  Points 


Figure  4.2  Imaginary  component  cross-power  for  the  keg  and  cylinder  targets  for  3 
different  weights,  truncated  near  the  target  location.  Each  plot  illustrates  6  different  trials 
(for  the  same  target,  and  weight) 

D.  FEATURES  EXTRACTIONATECTOR  QUANTIZATION 

Note  in  Figure  4.2  that  the  signals  contain  little  spectral  information.  The 
frequency  of  the  Rayleigh  waves  obtained  during  the  November  6*  and  10*^  experiments 
was  80Hz.  As  a  result,  the  signals  contain  very  little  useful  spectral  information.  We 
tried  to  apply  LPC  feature  extraction,  and  other  similar  spectral  coefficients,  but  we 
found  little  or  no  consistency  between  the  trials  of  a  given  class.  Therefore,  we  simply 
used  the  time-domain  information  directly.  We  segmented  the  signals  into  two  segments 
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(T=2),  and  decimated  each  frame  by  a  factor  of  20:1,  resulting  in  10  data  points  over  each 
segment,  as  shown  in  Appendix  C.4.  Finally,  we  normalized  those  points  to  compensate 
for  the  differences  in  signal  strength  coming  from  different  weights,  by  dividing  every 
value  with  the  maximum  one. 

We  applied  the  LGB  algorithm  at  the  VQ  stage  and  selected  M=8  symbols.  The 
code  implementation  is  given  in  Appendix  C.3. 

E.  HMM  TRAINING  FOR  THE  MULTIPLE  WEIGHTS  EXPERIMENT 

HMMs  need  data  to  be  trained.  However,  in  practice  the  availability  of  data  may 
be  seriously  limited  for  various  reasons  and  cross-validation  needs  to  be  used  [13]. 

This  study  has  only  a  limited  number  of  trials  available:  six  trials  for  every  type  of 
mine-like  object.  Thus,  we  used  cross-validation  by  successively  selecting  five  trials  for 
training  and  the  last  one  for  testing.  The  procedure  was  repeated  six  times  rotating  the 
testing  signal  each  time.  The  code  implementation  is  given  in  Appendices  C.5  and  C.6. 
We  found  that  the  best  performance  is  achieved  with  the  number  of  HMM  states  N  equal 
to  four. 

We  created  two  ergodic  HMM  models:  one  for  the  keg  class  and  one  for  the 
cylinder.  Thus,  we  used  a  total  number  of  5(trials)x3(types  of  weight)=15  signals  for  the 
HMM  training  for  every  class  (i.e.,  model).  We  selected  an  ergodic  model  as  it 
performed  better  than  the  left-to-right  model  for  the  data. 

F.  MULTIPLE  WEIGHTS  SCORING  AND  RESULTS 

The  MATLAB  file  sequence.m  given  in  Appendix  C.6  performs  all  HMM 
training  and  scoring.  Note  that  the  testing  signal  is  rotated  each  time,  and  the  results 
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plotted,  as  shown  in  Figure  4.3a  (cylinder  training  case)  and  in  Figure  4.3b  (keg  training 
case). 


Pr[0|lamcla(cy«nder)]  - 1  2:keg^32|^,  3:keg53g|^,  4:cylinder3g^u,.  5:cylinder^gg,^.  ercylnderj^,  7:background,  M=8.  N=4,  T=2 


■50  -40  -30  -20  -10  0 


■40  -30  -20 


-40  -30 


-10 


-20  -10 


«!dB] 


PewldB]:  101ogioPrbw(Oi;i) 


PvitcrbiidB]:  101og,oPrv(0|;^) 

<  -  ;  Correct  Decision 

<  X  ;  Wrong  Decision 

Figure  4.3a  HMM  testing  results  for  the  cylinder  model  (multiple  weights). 

Prbw(0|A,cyiinder)  and  Prv(0|A,cyUnder)  for  all  testing  signals  and  all  rotations  (every  row).  The 
model  A,cyiinder  is  created  by  training  all  cylinder  signals  (5x3=15).  The  decision  is  correct 
whenever  the  tested  signals  (4,5,6)  have  a  higher  Probability  (i.e.,  closer  to  0  on  a  dB 
scale)  than  1,  2,  3,  or  7  (keg  signals  and  background).  Each  row  represents  one  of  the  six 
iterations  of  the  rotation  between  testing  and  training  signals. 
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Pbw[c1B]:  101og,oPrbw(0|X) 

Pviteii)i[dB]:  lOlogioPrv(OI^) 

<  -  :  Correct  Decision 

<  ^  :  Wrong  Decision 

Figure  4.3b  HMM  testing  results  for  the  keg  model  (multiple  weights). 

Prbw(0|Xkeg)  and  Prv(0|)ikeg)  for  all  testing  signals  and  all  rotations  (every  row).  The 
model  Xkeg  is  created  by  training  all  keg  signals  (5x3=15).  The  decision  is  correct 
whenever  the  tested  signals  (1,  2,  3)  have  a  higher  probability  (i.e.,  closer  to  0  on  a 
dB  scale)  than  4,  5,  6,  or  7  (cylinder  signals  and  background).  Each  row  represents 
one  of  the  six  iterations  of  the  rotation  between  testing  and  training  signals. 

We  used  as  background  (non  target)  signals,  the  six  trials  obtained  from  the  2“*^ 
geophone  during  the  keg  experiment  which  didn’t  track  any  target  signal.  Figures  4.3a 
and  4.3b  show  that  an  decision  error  for  one  case  only.  The  signal  considered  in  that  set¬ 
up  is  the  4*^  tested  signal,  i.e.,  the  cylinder  with  weight  of  3681bs.  Figure  4.2  shows  that  it 
is  the  weakest  signal  of  all  the  mine  signals.  The  total  number  of  testing  data  is 
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2(mines)x3(weights)x6(rotations)=36,  thus  the  overall  classification  performance  of  the 
system  is  (36-l)/36  =97.2%. 

G.  MULTIPLE  DISTANCES  EXPERIMENT 

Up  to  this  point  we  showed  that  we  can  recognize  the  shape  of  the  mine-like 
objects  for  a  fixed  distance.  Next,  we  apply  the  HMM-based  classifier  to  recognize  these 
objects  located  at  three  different  distances  from  the  actuator,  as  we  need  to  be  able  to 
detect  and  recognize  a  mine  independent  of  the  distance  from  the  sonar  equipment  for  a 
more  realistic  set-up.  Experiments  conducted  on  the  6*  and  the  10*  of  November  1998 
provide  the  following  data  by  moving  the  location  of  the  geophones  between  the  actuator 
and  the  target  which  stays  fixed  at  16ft,  as  illustrated  in  Appendix  B.6.  Thus,  the 
following  three  set-ups  are  available: 

•  Distance  between  actuator  and  geophone  equal  to  6ft,  resulting  in  the  total 
distance  (actuator-geophone-target-geophone)  equal  to  26ft, 

•  Distance  between  actuator  and  geophone  equal  to  8ft,  resulting  in  the  total 
distance  (actuator-geophone-target-geophone)  equal  to  24ft. 

•  Distance  between  actuator  and  geophone  equal  to  10ft,  resulting  in  the  total 
distance  (actuator-geophone-target-geophone)  equal  to  22ft. 

Note  that  the  reflected  signals  obtained  for  the  cylinder  for  total  distances  equal  to  22ft 
and  24ft  were  very  weak  (in  fact,  same  range  as  that  of  the  background  noise).  Thus,  we 
used  this  data  for  the  powder  keg  case  only. 

We  created  two  classes  again.  The  first  class  contains  the  keg  data  with  weight 
2241bs  obtained  for  the  6ft,  8ft,  and  10ft  experimental  set-ups,  and  6  trials  at  each 
distance.  The  second  class  is  composed  of  the  cylinder  data  with  weight  3641bs  for  the 
experimental  set-up  of  10ft.  Six  trials  are  also  available  for  the  experimental  set-up.  All 
signals  were  segmented  again  around  the  target  location,  as  done  before,  resulting  in 
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signals  with  length  equal  to  400  points,  as  shown  in  Figure  4.4.  Again,  Figure  4.4  shows 
that  there  is  also  a  viewable  consistency  for  the  mine-like  keg  object.  No  conclusion  can 
be  drawn  for  the  cylinder,  as  the  signal  was  too  weak  to  be  usable. 

We  used  the  same  feature  extraction  as  that  considered  earlier  for  the  multiple 
weight  experiment  (2  segments,  decimation,  VQ,  number  of  symbols  M=8),  and  the 
implementation  is  presented  in  Appendix  C.9.  The  model  selected  is  ergodic  with  four 
states  (N=4).  Cross-validation  was  used  again  due  to  the  limited  amount  of  data 
available.  Scoring  results  obtained  for  the  keg  class  model  and  the  cylinder  class  model 
are  presented  in  Figures  4.5a  and  4.5b  respectively.  The  MATLAB  files  implementations 
for  the  HMM  training/testing  using  multiple  distances  case  are  presented  in  Appendices 
C.IO,  C.ll,andC.12. 


Figure  4.4  Imaginary  component  of  the  cross-power  for  3  different  distances  for  the  keg 
target  and  one  distance  for  the  cylinder.  Each  plot  illustrates  6  different  trials  (for  the 
same  target,  and  distance). 
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PewEdB]:  101ogioPrbw(0|A,) 


Pviierbi[<iB]:  10IogioPrv(0|A,) 

< - :  Correct  Decision 

Figure  4.5a  Prbw(0|Xkeg)  and  Prv(0|A,keg)  for  all  testing  signals  and  all  rotations 
(every  row).  The  model  Xkeg  is  created  by  training  all  keg  signals  (5x3=15).  The  decision 
is  correct  whenever  the  tested  signals  (1,  2,  3)  have  a  higher  probability  (i.e.,  closer  to  0 
on  a  dB  scale)  than  4,  or  5  (cylinder  signals  and  background).  Each  row  represents  one  of 
the  six  iterations  of  the  rotation  between  testing  and  training  signals. 
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Pbw[<1B]:  101ogioPrbw(0|X) 


Pviterbi[<JB]:  101ogioPrv(0|;^) 

< - :  Correct  Decision 

Figure  4.5b  Prbw(0|A,cyiinder)  and  Prv(0|Xcyiinder)  for  all  testing  signals  and  all 
rotations  (every  roAv).  The  model  XcyUnder  is  created  by  training  all  cylinder  signals 
(5x1=5).  The  decision  is  correct  whenever  the  tested  signals  (4)  have  a  higher  probability 
(i.e.,  closer  to  0  on  a  dB  scale)  than  1,  2,  3,  or  5  (keg  signals  and  background).  Each  row 
represents  one  of  the  six  iterations  of  the  rotation  between  testing  and  training  signals. 

Figures  4.5a  and  4.5b  show  that  the  system  performs  100%  correct  detection, 
which  seems  to  indicates  that  the  HMMs  took  advantage  of  the  consistency  of  the  signals 
at  different  distances.  However,  additional  data  is  needed  to  make  further  conclusions. 


H.  CONCLUSIONS 

This  section  described  a  HMM-based  mine-like  object  classification  system  using 
the  seismo-acoustic  waves  provided  by  the  NPS  project  [10-12].  Initial  results  indicate 
that  the  HMM-based  classifier  can  recognize  the  type  of  mine-like  object,  independent  of 
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the  object  weight  with  a  97%  accuracy.  Results  also  indicate  that  it  can  recognize  the 
object  type  at  different  distances  with  a  100%  accuracy.  However,  the  experiments  were 
conducted  with  very  few  data,  and  further  work  needs  to  be  done  to  confirm  these  initial 
findings  by  using  a  larger  data  set. 
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V.  MINE-LIKE  OBJECT  RECOGNITION  USING  NEURAL  NETWORKS 


This  chapter  considers  the  application  of  a  back-propagation  neural  network 
classifier  to  the  same  data  as  that  considered  earlier  with  the  HMM-based  classifier.  The 
goal  of  this  chapter  is  to  compare  the  resulting  performances  obtained  with  the  two 
different  implementations. 

A.  NEURAL  NETWORK  DESCRIPTION 

NN  input  feature  vectors  are  those  obtained  after  the  VQ  step  described  earlier  in 
Chapter  HI.  Thus,  we  have  two  signal  classes  again:  the  keg  and  the  cylinder  class.  We 
select  a  back-propagation  feed-forward  NN  (BPNN)  with  one  hidden  layer  and  two 
outputs  (one  for  each  class  of  mine-like  objects).  Note  we  do  not  specific  a  target  output 
for  the  background  signal  as:  1)  there  is  no  consistency  between  the  different  background 
signals  available,  and  2)  we  do  not  use  this  classification  technique  to  confirm  a  target 
existence. 

The  network  specific  structure  is  described  in  Figure  5.1.  BPNNs  use  the  steepest 
descent  algorithm,  or  variants  of  it,  to  find  the  weights  which  minimize  the  squared  error 
between  target  and  network  outputs  [2,14].  BPNN  may  use  one  or  more  hidden  layers  of 
sigmoid  neurons,  and  one  output  layer  of  linear  neurons[14]. 
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Input  Hidden  Layer  Output  Layer 


Sh:  #  neurons  in  the  hidden  layer  (in  our  case  Sh=60) 
So:  #  neurons  in  the  output  layer  (in  our  case,  So=l) 
a‘=logsig(W,.ip'+b‘) 
a^=purelin(W2,ia*+b^) 


Figure  5.1  Hidden  and  output  layer  of  neurons  of  a  backpropagation  feedforward 
neural  network. 

B.  MULTIPLE  WEIGHTS  SET-UP 

We  used  the  observations  from  the  3  different  weights  for  each  mine-like  object 
(keg  and  cylinder),  5  trial  each,  as  input  vectors  p  to  train  the  neural  network.  We  used 
the  numbers  “1”  and  “2”  as  targets  for  the  neural  network,  such  as  “1”  for  the  keg  class, 
and  “2”  for  the  cylinder  class  (see  Appendix  D.l  for  the  MATLAB  code).  Finally,  we 
tested  the  6“*  trial  of  each  signal  by  computing  its  output  value  to  the  trained  NN.  The 
test  signal  is  recognized  as  a  “keg”  signal  when  the  output  is  close  to  1,  and  it  is 
recognized  at  a  “cylinder”  signal  when  it  is  close  to  2.  Note  that  the  NN  output  is  not 
binary,  so  we  set  a  range  around  output  values  1  and  2,  around  which  the  data  is  said  to 
belong  to  one  class  or  the  other.  The  specific  range  is  set  at  3%  above  or  below  the  target 
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output  values  1  and  2.  The  data  is  considered  as  not  belonging  to  either  class  (other) 
when  the  NN  output  value  is  outside  that  range. 

We  also  tested  the  background  signal  defined  for  the  HMM-based  classifier.  We 
rotated  training  and  testing  observations  six  times,  as  done  with  the  HMM-based 
classifier.  Results  for  the  multiple-weights  case  are  given  in  Table  5.1.  Note  that  the 
system  recognizes  all  testing  signals,  except  the  first  background  signal,  which  is  detected 
as  a  cylinder-class  signal  because  the  associated  NN  was  equal  to  1.999  (within  the  ±3% 
range  of  the  target  output  value  2  associated  to  the  cylinder-class).  Thus,  the  overall 
recognition  performance  is  97%,  as  the  total  number  of  testing  signals  is  42  (7signals  x  6 
rotations). 
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Table  5.1  Multiple  weights  NN  outputs/decisions 

Testing  Signal 


Rotation 

Keg2241b 

Keg432ib 

Keg5361b 

Cyl3641b 

Cyl4681b 

Cyl572Ib 

Backgr 

1 

0.9999 

0.9999 

0.9999 

1.9999 

1.9999 

1.9999 

1.9999 

V 

V 

V 

V 

V 

V 

X(Cyl) 

2 

0.9831 

0.9958 

0.9958 

1.9962 

1.9780 

1.9780 

1.1827 

V 

V 

V 

V 

V 

V 

V 

3 

1.0043 

0.9993 

0.9993 

1.9962 

1.9962 

1.9962 

1.3478 

V 

V 

V 

V 

V 

V 

V 

4 

1.0001 

0.9999 

0.9999 

1.9962 

1.9996 

1.9996 

0.7544 

V 

V 

V 

v 

V 

V 

V 

5 

0.9999 

1.0000 

1.0000 

1.9999 

1.9999 

1.9996 

I 

2.4937 

V 

V 

V 

V 

V 

V 

V 

6 

1.0117 

0.9993 

0.9993 

1.9977 

2.0053 

2.0053 

0.2342 

V 

V 

V 

V 

V 

V 

V 

V:  Correct  Decision 
X:  Wrong  Decision 
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C.  MULTIPLE  DISTANCES  SET-UP 


Similarly,  the  NN  is  used  for  the  multiple  distance  set-up,  as  considered  earlier  in 
Section  IV.  Results  are  presented  in  Table  5.2,  and  the  code  is  given  in  Appendix  D.2. 

Table  5.2  Multiple  distances  NN  outputs/decisions 


Testing  Signal 

Rotation 

Kegfift 

Kegsft 

Kegs  10ft 

Cylioft 

Backgr 

1 

0.9966 

1.3831 

1.001 

1.9705 

1.0014 

1 

X(back) 

V 

V 

X  (Cyl) 

2 

0.9942 

0.9924 

1.012 

2.012 

2.3552 

1 

y 

V 

V 

V 

V 

3 

1.023 

1.042 

0.9923 

1.999 

1.3245 

V 

V 

V 

V 

V 

4 

1.034 

1.045 

0.9943 

0.9938 

1.5423 

V 

V 

t 

V 

V 

V 

5 

1.003 

1.000 

0.9945 

1.9936 

1.5945 

V 

V 

V 

V 

V 

6 

1.010 

0.9995 

0.9991 

1.9994 

0.2342 

V 

V 

V 

V 

V 

V:  Correct  Decision 
X:  Wrong  Decision 
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Table  5.2  shows  that  all  decisions  are  correct,  except  in  two  cases: 

1)  the  first  trial  of  the  keg  data  at  8ft  gets  detected  as  background, 

2)  the  first  trial  of  the  background  signal  gets  detected  as  cylinder-class  data. 

Thus,  the  overall  classifier  performance  is  93%  for  this  set-up. 

D.  CONCLUSIONS/COMPARISON  WITH  THE  HMM-BASED  CLASSIFIER 

These  results  show  that  the  two  classifiers  performed  in  a  similar  manner.  The 
overall  performance  was  97%  for  the  multiple  weights  set-up,  while  the  HMM-based 
classifier  performance  was  100%,  and  the  NN  93%  for  the  multiple  distance  set-up.  Note 
that  no  background  signal  was  used  during  the  NN  training,  and  that  we  assigned  as 
background  data  which  NN  output  didn’t  fall  into  the  prescribed  range. 

Finally,  we  measure  the  speed  of  the  execution  of  the  MATLAB  file  for  the 
evaluation  of  all  the  results  for  the  multiple  distances  case.  Thus,  in  a  Pentium  EH 
450MHz,  128MB  Ram,  the  execution  time  for  the  HMMs  was  5s,  and  for  the  NN  was 
15s,  thus  the  HMM  was  3  times  faster  that  the  NN  (training  and  testing). 
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VI.  CONCLUSIONS  AND  RECOMMENDATIONS 


A.  CONCLUSIONS 

This  study  presented  an  introduction  to  HMMs  and  their  applications  to 
classification  problems.  HMMs  require  the  selection  of  consistent  characteristics  between 
the  classes  to  perform  well.  In  addition,  many  other  parameters  have  to  be  carefully 
selected  to  set-up  the  HMM  to  have  it  perform  successfully.  We  implemented  the  HMM 
using  MATLAB  and  tested  it  on  a  simple  four  isolated  word  recognition  problem. 

The  HMM-based  classifier  implemented  in  this  study  performed  very  well  on  the 
limited  (two  classes)  mine-like  object  recognition  experiment.  Results  also  show  that  its 
performance  is  similar  to  that  obtained  with  a  BPNN.  However,  the  classifier  needs  to  be 
run  using  larger  data  sets  to  confirm  the  present  findings. 

B.  RECOMMENDATIONS 

The  reflected  signals  from  the  targets  were  quite  weak,  especially  when  the  target 
distance  was  large,  and  the  mine  weight  low.  Furthermore,  the  received  reflected  signals 
from  the  two  geophones  sometimes  had  different  strengths,  (see  Appendix  B.IO),  or 
weren’t  received  by  both  geophones,  (see  Appendix  B.ll).  Thus,  additional  data  is 
required  to  further  the  classification/recognition  process,  and  data  collection  is  in 
progress,  as  of  September  1999  [15].  Finally,  a  more  complicated  HMM-based  set-up 
would  probably  be  needed,  if  we  want  to  train  the  classifier  with  different  types  of  mines, 
possibly  using  a  higher  number  of  segments  (T),  and/or  symbols  (M),  and/or  states  (N). 
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APPENDIX  A.  HIDDEN  MARKOV  MODEL  MATLAB  PROGRAMS 


This  appendix  contains  the  various  MATLAB  programs  used  for  the  HMM-based 
classification  of  isolated  words. 
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APPENDIX  Al.  FEATURES  EXTRACTIONAHECTOR  QUANTIZATION  USING 

NEURAL  NETWORKS 


%  Filename:  training. m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Features  Analysis  (LPC  &  Energy)  of  4  words  (Statistics, 
Microsoft,  Instructor, 

%  and  Professor) ,  3  trials  each  +  1  test  word,  vector 

quantization  using  a  competitive  layer  neural  network 
%  for  HMM  classification  use. 


clear 

%  M:  dimension  of  codebook,  N:  number  of  states 

M=4;N=8; 

rand( ' seed' ,  1) ; 

%  the  mat  files  contain  the  word  signals 

%load  s tat .mat ; load  micro. mat, -load  instrprof .mat; 

load  words. mat; 

%  signals  interpolation 

%  coeff  is  the  fuction  that  gives  back  the  LCP  +  energy  coefficients. 

%  T  is  the  number  of  segments 

[cstal,T] =coef f (stal) ; [csta2 , T] =coef f (sta2) ; [csta3 , T) =coef f (sta3) ; [csta 
4,T] =coef f (sta4) ; 

[cmicrol,T] =coeff (microl) ; [cmicro2,T] =coeff (micro2) ; 

[cmicro3,T] =coeff {micro3) ; [cmicro4, T] =coef f (micro4) ; 

[cinstrl,T] =coeff (instrl) ; [cinstr2,T] =coeff (instr2) ; 

[cinstr3,T] =coeff (instr3) ; [cinstrtest , T] =coeff (instrtest) ; 

[cprofl,T] =coeff (profl) ; [cprof2,T] =coeff (prof2) ; 

[cprof3 ,T] =coef f (prof 3) ; [cprof test , T] =coef f (prof test ) ; 


n=l:100; 

%  we  convert  the  results  by  a  constant,  because  the  NN  works  better 
convert =2 000 ; 

cstal=cstal*convert ; cs ta2=csta2* convert ;csta3=csta3*convert;csta4=csta4 
* convert ; 

cmicrol=cmicrol*convert ; cmicro2=cmicro2*convert ;cmicro3=cmicro3*convert 
;cmicro4=cmicro4*convert; 

cinstrl=cinstrl *convert ; cins tr2=cinstr2  *convert ; cinstr3=cinstr3  *convert 
;cinstrtest=cinstrtest*convert; 

cprof l=cprofl*convert; cprof2=cprof2*convert ;cprof3=cprof3*convert ;cprof 
test=cprof test ^convert ; 

%  vector  quantization  with  NN 
pl= [cstal ; csta2 ; csta3 ; 

cmicrol ; cmicro2 ; cmicro3 ; cinstrl ; cinstr2 ; cinstr3 ] ' ; 
prl=minmax(pl)  ;  . 

net=newc  (prl,M,  1)  ; 

net . trainParam. epochs=1000 ; 

net . trainParam. show=l 00 ; 

net=train(net,pl) ; 

w=net . IW{1} ; 
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ys=sim(net,cstal ' ) ; 

%  classsta  is  the  observations  vector  0  for  the  first  trial  of  the  word 

' Statistics ' . 

classsta=vec2ind(ys) ; 

ys2=sim{net , csta2 ' ) ; 

classsta2=vec2ind(ys2) ; 

ys3=siin(net ,  csta3  ' )  ; 

classsta3=vec2ind(ys3) ; 

ys4=siin(net,csta4')  ; 

classstatest=vec2ind(ys4)  ; 

ym=sim{net ,  cmicrol ' )  ; 

classmicro=vec2ind(yItl)  ; 

yin2=sim{net ,  cmicro2  ' )  ; 

classmicro2=vec2ind(ym2) ; 

yin3=sim(net ,  cniicro3  ' )  ; 

classmicro3=vec2ind(ym3)  ; 

yin4=sim(net ,  cinicro4 ' )  ; 

classinicrotest=vec2ind(yin4')  ; 

yi=sim(net , cinstrl ' ) ; 
classinstr=vec2ind(yi)  ; 
yi2=sim(net , cinstr2 ' ) ; 
classinstr2=vec2ind{yi2)  ; 
yi3=sim(net , cinstr3 ' ) ; 
classinstr3=vec2ind(yi3)  ; 
yi4=sim{net , cinstrtest ' ) ; 
classinstrtest=vec2ind(yi4) ; 
yp=sim(net , cprof 1 ' ) ; 
classprof=vec2ind(yp) ; 
yp2=sim(net , cprof2 ' ) ; 
classprof2=vec2ind(yp2) ; 
yp3=siiti(net,  cprof3  ' )  ; 
classprof3=vec2ind{yp3) ; 
yp4=siin  (net ,  cprof  test ' )  ; 
classproftest=vec2ind(yp4)  ; 


%  we  divide  by  the  constant  multiplied  before 

cstal=cstal/convert;csta2=csta2 /convert ;csta3=csta3 /convert ;csta4=csta4 
/ convert ; w=w/ convert ; 

cmicrol=cmicrol/convert;cmicro2=cmicro2 /convert;  cmicro3=cmicro3/ convert 
; cmicro4=cmicro4/ convert ; 

cinstrl=cinstrl/convert ;cinstr2=cinstr2 /convert ;cinstr3=cinstr3 /convert 
; cinstrtest=cinstrtest /convert ; 

cprof l=cprofl /convert ; cprof 2 =cprof 2 /convert ; cprof 3 =cprof 3 /convert ; cprof 
test=cprof test /convert ; 

%  Test  of  vector  quantization 
figured) 

subplot (4, 2 ,1) ;plot (1 : 8, cstal (1, : ) , ' * ' , 1 : 8 , w(classsta (1) / : ) ) , title ( [ 've 
ctorl,Class='  n\r:n2str  (classsta (1) )  '  nuin2str  (M)  ] ) 

subplot (4,2,2) ;plot (1 : 8, cstal (2 , 1 : 8, w( classsta (2) , : ) ) , title ( [ 've 
ctor2,Class='  nurci2str  (classsta (2)  )  ]  ) 

subplot (4,2,3) ;plot (1 : 8 , cstal (3 , : 8 , w(classsta (3 ) , : ) ) , title ( [ 've 
ctor3,Class='  nxain2str (classsta (3) )  ]  ) 

subplot (4, 2 , 4) ;plot(l:8,cstal(4, :),'*' , 1 : 8,w(classsta (4) , :) ) , title ( [ 've 
ctor4,Class=:'  niim2str  (classsta (4) )  ]  ) 
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subplot (4, 2 , 5) ;plot (1 : 8, cstal (5, 1 : 8, w(classsta (5) , : ) ) / title ( [ 've 
ctor5,Class='  num2str (classsta(5) ) ] ) 

subplot (4,2,6) ;plot (1:8, cstal (6,:),'*',!: 8 , w(classsta { 6 ) , : ) ) , title ( [ 've 
ctor6,Class='  num2str (classsta (6) ) ] ) 

subplot (4, 2, 7) ;plot (1 : 8, cstal (7 1 : 8 ,w (classsta (7 ) , : ) ) , title ( [ 've 
ctor7,Class='  num2str (classsta{7) ) ] ) 

subplot (4, 2, 8) ;plot (1 : 8, cstal (8, 1 : 8 ,w (classsta ( 8 ) , : ) ) , title ( [ 've 
ctor8,Class='  num2str (classsta (8) )  ]  ) 
figure (2) 

subplot (4, 2 , 1) ;plot (1 : 8, cmicrol : 8 , w (classmicro ( 1) , : ) ) , title ( 
[ 'vectorl,Class='  nuin2str  (classmicrod)  )  '  M='  nuxa2str  (M)  ] ) 
subplot (4,2,2) ;plot (1:8, cmicrol (2,:),'*',!: 8 ,w (classmicro (2 ) , : ) ) , title ( 
[  'vector2,Class='  nuin2str  (classmicro (2 )  )  ]  ) 

subplot (4, 2,3) ;plot (1:8, cmicrol (3,:),'*',!: 8 , w(classmicro (3 ) , : ) ) , title ( 
[ 'vector3,Class='  num2str (classmicro (3) ) ] ) 

subplot (4,2,4) ; plot (1 : 8, cmicrol (4, 8 , w(classmicro (4) , : ) ) , title ( 
[ 'vector4,Class='  num2str (classmicro (4) ) ] ) 

subplot (4,2,5) ;plot (1:8, cmicrol (5, 8 , w(classmicro (5) , : ) ) , title ( 
[ 'vectors, Class='  num2str (classmicro (5) ) ] ) 

subplot (4, 2, 6) ;plot (1 : 8, cmicrol (6, ;),'*', 1 : 8 , w (classmicro ( 6 ) , : ) ) , title ( 
[ ' vectors, Class='  num2str (classmicro(6) ) ] ) 

subplot (4, 2 ,7) ;plot (1 : 8, cmicrol (7 ,:),'*' , 1 : 8 , w (classmicro (7 ) , : ) ) , title { 
[ ' vector7 , Class= '  num2str (classmicro(7) ) ] ) 

subplot (4, 2,8) ;plot (1:8, cmicrol (8, :),'*',!: 8 , w (classmicro ( 8 ) , : ) ) , title ( 
[ 'vectors, Class='  num2str (classmicro (8) ) ] ) 

%qa (1 ;T, : ) =w( classmicro (1 :T) , : ) ; 

%qa(l:T, : ) =w(classsta (1 :T) , : ) ; 
for  n=l:T 

distances (n) =  dist (cstal (n, : ) , w{classsta (n) ,:)'); 
distancem(n) =  dist (cmicrol (n, : ) , w(classmicro (n) ,:)'); 


end 

figure (3) 

stem(distances) , title { 'Euclidean  Distance  original/quantized  vectors') 
figure (4) 

stem(distancem) , title ( 'Euclidean  Distance  original/quantized  vectors') 
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APPENDIX  A2.  FEATURES  EXTRACTIONATECTOR  QUANTIZATION  USING 

K-MEANS 


%  Filename :  trainingl . m 
%  Written  by:  M/  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Features  Analysis  (LPC  &  Energy)  of  4  words  (Statistics, 
Microsoft,  Instructor, 

%  and  Professor) ,  3  trials  each  +  1  test  word,  vector 

quantization  using  k-means 
%  for  HMM  classification  use. 


clear 
M=4;N=8; 
n_it=20; 
rand( 'seed' , 1) ; 

%load  stat .mat ; load  micro. mat; load  instrprof .mat ; 
load  words. mat; 

%  signals  interpolation 

%  T  is  the  number  of  segments 

%  sstal,ssta2  ...  coefficients  of  each  trial 

[cstal, T] =coef f (stal) ; [csta2 , T] =coef f (sta2 ) ; [csta3 , T] =coef f (sta3) ; [csta 
4,T]=coeff (sta4) ; 

[cmicrol,T] =coeff (microl) ; [cmicro2,T] =coeff (micro2) ; 

[cmicro3 , T] =coef f (micro3 ) ; [cmicro4,T] =coeff (micro4) ; 

[cinstrl, T] =coef f (instrl) ; [cinstr2 , T] =coef f {instr2 ) ; 

[cinstr3 , T] =coeff (instr3 ) ; [cinstrtest , T] =coef f (instrtest) ; 

[cprofl,T] =coef f (profl) ; [cprof2 , T] =coef £ (prof 2) ; 

[cprof3,T] =coeff (prof3) ; [cproftest,T] =coeff (prof test) ; 


%  vector  quantization 
pl= [cstal; csta2 ;csta3 ; 

cmicrol; cmicro2 ;cmicro3 ; cinstrl; cinstr2 ; cinstr3] ; 

%  CodeBook  creation: 

[CODE,  label,  dist]  =  svq(pl,  M,  n_it) ; 

%  Vector  quantization 

[w  classsta,  dist]  =  vq(cstal,  CODE,  n__it)  ; 

[w2,  classsta2,  dist]  =  vq(csta2,  CODE,  n_it) ; 

[w3  classsta3,  dist]  =  vq(csta3,  CODE,  n_it) ; 

[wt,  classstatest ,  dist]  -  vq{csta4,  CODE,  n_it) ; 

[w  classmicro,  dist]  =  vq(cmicrol,  CODE,  n_it) ; 

[w  classmicro2,  dist]  =  vq(cmicro2,  CODE,  n_it) ; 

[w  classmicro3,  dist]  =  vq(cmicro3,  CODE,  n_it) ; 

[w  classmicrotest ,  dist]  =  vq{cmicro4,  CODE,  n_it) ; 

[w  classinstr,  dist]  =  vq (cinstrl,  CODE,  n_it) ; 

[w  classinstr2,  dist]  =  vq{cinstr2,  CODE,  n_it) ; 

[w  classinstr3,  dist]  =  vq(cinstr3,  CODE,  n_it) ; 

[w  classinstrtest ,  dist]  =  vq{cinstrtest ,  CODE,  n__it)  ; 
[w  classprof,  dist]  =  vq(cprofl,  CODE,  n_it) ; 

[w  classprof2,  dist]  =  vq(cprof2,  CODE,  n_it) ; 

[w  classprof3,  dist]  =  vq(cprof3,  CODE,  n_it) ; 

[w  classproftest,  dist]  =  vq(cprof test,  CODE,  n_it) ; 


%  Test  of  vector  quantization 
figured) 

subplot  (4,2,1)  ;plot  {l:8,cstal(l,  :),'*M:8,w(l) ),  title  (  [  'vectorl,Class= 
'  num2str (classsta(l) )  '  M='  num2str(M)]) 

subplot (4, 2,2) ;plot (l:8,cstal(2, : ) , ' * M : 8,w(2) ), title ([ 'vector2 , Class= 
'  nuin2str  (classsta(2)  )  ]  ) 

subplot (4, 2,3) ;plot(l:8,cstal(3, : ) , ' * M : 8 , w(3 ) ), title {[ 'vector3 , Class= 
'  ni;iin2str  (classsta(3)  )  ]  ) 

subplot  (4,2,4)  ;plot  (l:8,cstal(4,  :),'*M:8,w(4))  ,  title  (  [ '  vector4,  Class= 
'  nuin2str  (classsta  (4)  )  ]  ) 

subplot (4,2,5) ;plot (l:8,cstal(5, : ) , ' * M : 8 , w(5) ) , title ( [ 'vectors , Class= 
'  num2str (classsta (5) ) ] ) 

subplot(4,2,6) ;plot (l:8,cstal (6, : ) , ' * ' , 1 : 8, w(6) ), title ( [ 'vector6,Class= 
'  nuin2str  (classsta (6)  )  ]  ) 

subplot  (4,2,7)  ;plot  (l:8,cstal  (7,  1:8, w(7))  ,  title  (  [ '  vector7 ,  Class= 

'  nuin2str(classsta(7) )  ] ) 

subplot (4,2,8) ;plot (l:8,cstal(8, :),'*', 1 : 8 ,w(8) ), title ([ 'vector8 , Class= 
'  nuin2str  (classsta  (8)  )  ]  ) 
figure (2) 

subplot (4,2,1) ;plot (l:8,cmicrol(l, : ) , ' * ' , 1 : 8 , w (classmicro ( 1) , : ) ) , title ( 

[ 'vectorl,Class='  num2str  (classmicro (1)  )  '  M='  nuin2str  (M)  ].) 

subplot (4,2,2) ;plot (1:8, cmicrol (2 ,:),'*', 1 : 8 , w(classmicro (2) , : ) ) , title ( 

[ 'vector2 , Class= '  num2str (classmicro (2 ) ) ] ) 

subplot (4,2,3) ;plot (1:8, cmicrol (3, : ) , 1 : 8 ,w (classmicro (3 ) , : ) ) , title ( 

[ 'vector3 , Class= '  num2str (classmicro (3 ) ) ] ) 

subplot (4,2,4) ;plot (1:8, cmicrol (4, : ) , ' * ' , 1 : 8 , w(classmicro (4 ) , : ) ) , title ( 

[ 'vector4,Class='  num2str (classmicro (4) ) ] ) 

subplot  (4,2,5)  /-plot  (1:8,  cmicrol  (5,  :)  ,  '*',1;  8 ,  w(classmicro  (5)  ,  : )  )  ,  title  ( 

[ 'vector5,Class='  num2str (classmicro (5) ) ] ) 

subplot (4,2,6) ;plot (1:8, cmicrol (6, :) , '*',1; 8 , w(classmicro ( 6) , : ) ) , title ( 

[ 'vectors , Class= '  num2str (classmicro (6) ) ] ) 

subplot (4,2,7) ;plot (l:8,cmicrol(7, :),'*' , 1 : 8 , w(classmicro (7 ) , :) ) ,title( 

[ 'vector7 , Class= '  num2str (classmicro (7 ) ) ] ) 

subplot (4,2,8) ;plot (1:8, cmicrol (8,:),'*',!: 8 , w (classmicro ( 8 ) , : ) ) , title ( 

[ 'vector 8 , Class= '  num2str (classmicro (8) ) ] ) 

%qa(l:T, : ) =w(classmicro (1 :T) , : ) ; 

%qa(l :T, : ) =w (classsta ( 1 : T) , : ) ; 
for  n=l:T 

distances (n) =  dist (cstal (n, : ) , w(classsta (n) , : ) ' ) ; 
distancem(n) =  dist (cmicrol (n, : ) , w(classmicro (n) ,:)'); 


end 

figure (3 ) 

stem(distances) , title { 'Euclidean  Distance  original/quantized  vectors') 
figure (4) 

stem(distancem) , title { 'Euclidean  Distance  original/quantized  vectors') 
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APPENDIX  A3.  FEATURES  EXTRACTION  (LPC  +  ENERGY)  FUNCTION 


%  Filename:  coeff.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  fuction  that  computes  the  LPC  coefficients  and  the  energy  of 
each  segment 

%  T:  total  number  of  segments 
%  c:  Coefficients  vector 

function  [c,T] =coeff (word) ; 

%  signals  interpolation 
%word=word -mean  (word)  ; 

1= length (word) ; 

%  interpolation  of  word,  so  every  word  has  10000  data  points 
int=interpl (1 : 1, word, 1 : 1/10000 : 10000)  ; 

%  Pre-emphasis  filtering: 
pre=filter ( [1 , - . 98] , 1, int) ; 

start=l; 
step=sl200  ; 
l=10000-step; 

%  Overlaping  of  segments: 
overlap=round ( (10/100) *step) ; 

%#  of  Observations  T: 

T=l/step; 

T=floor (T) ; 
for  n=l:T 

c (n, 1 : 8 ) =ar_covar (pre ( start : start+step-l+overlap) ,7); 
start=start+step; 

x=xcorr (pre (start: start+step-l+overlap) ) ; 
lx=length(x) ; 
center=find(x==max(x) ) ; 
c(n,l)=x (center) ; 

end 

C  (  :  ,  1)  =C  (  :  ,  1)  /max(c  (:,!)); 
c(:,l)=c(:,l)-.5; 
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APPENDIX  A4.  FORWARD  VARIABLE  ALGORITHM  IMPLEMENTATION 


%  Filename:  forward. m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Implements  the  forward  algorithm  (Rabiner  1986  pg9) 

fiinction  afw=f oirward (a, b, pi , 0, N, T) 

%  0:  the  observation  sequence 

%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  number  of  states 

%  M:  total  nuimber  of  possible  observation  symbols  (dimension  of 

codebook) 

%  afw:  the  forward  variable  (N  x  T) 

%  bbw  the  backward  variable  (N  x  T) 

%  initial  values; 
afw=zeros (T,N) ; 

afw(l,l:N)=pi(l:N) . *b{l :N,0(1) ) ' ; 

%  recursion: 

%  we  are  using  eps  (for  zero  results)  to  avoid  divided  by  zero  problems 
if  norm{afw(l, : ) ) <eps 
afw  ( 1 ,  : ) =eps ; 

end 

%  FW  algorithm: 
for  t=l:T-l 
for  j=l:N 

S=0; 

for  i=l:N 
%  suuranation  S: 

S=S+af w (t,i)*a(i,j) ; 
end 

%logafw=loglO (S) +loglO (b( j ,0(t+l) ) ) ; 

afw(t+l, j ) =S*b( j ,0(t+l) ) ; 
if  norm(afw( t+1, j ) ) <eps 
afw(t+l, j ) =eps; 
end 

%afw{t+l/  j )  =10'"logafw; 

end 

end 
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APPENDIX  A5.  BACKWARD  VARIABLE  ALGORITHM  IMPLEMENTATION 


%  Filename:  backward. m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Implements  backward  algorithm  (Rabiner  1986  pg9) 

function  bbw=backward (a, b, pi , 0, N, T) 

%  0:  the  observation  sequence 

%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  number  of  states 

%  M:  total  number  of  possible  observation  symbols  (dimension  of 

codebook) 

%  afw:  the  forward  variable  (N  x  T) 

%  bbw  the  backward  variable  (N  x  T) 

%  BW  Algorithm: 

%  Initial  Values : 

bbw=zeros (T,N) ; 
bbw{T,l:N)=l; 

%  recursion: 

%  we  are  using  eps  (for  zero  results)  to  avoid  divided  by  zero  problems 
if  norm(bbw{T,  : )  ) <eps 
bbw{T, : ) =eps; 

end 


for  i=l:N 
S=0; 

for  j=l:N 

S=S+a (i, j ) *b{ j , 0 (t+1) ) *bbw(t+l, j ) ; 

end 

bbw ( t , i ) =S ; 
if  S<eps 

bbw{t, i) =eps; 

end 


end 

end 
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APPENDIX  A6.  FW/BW  VARIABLES  SCALING 


%  Filename:  scale. m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Implements  scaling  of  FW  and  BW  variables,  according  to 
Rabiner(1989  pg  272) 

%  We  perform  scaling  to  avoid  exceeding  the  precision  range  of  the 
computer 

%  0:  the  observation  sequence 

%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  niomber  of  states 

%  M:  total  number  of  possible  observation  symbols  (dimension  of 

codebook) 

%  afw:  the  forward  variable  (N  x  T) 

%  bbw  the  backward  variable  (N  x  T) 

function  [afwl , bbwl , c] =scale (a, b, pi ,0); 

N=length{a) ; 
s=size (O) ; 

T=s(2); 

afw=forward(a,b,pi,0,N,T)  ; 
bbw=backward(a,b,pi,0,N,T) ; 
afw2=zeros (T,N) ; 
afw2 ( 1 , : ) =afw (1,:); 

c (1) =l/sum(afw(l,  ;)); 
afwl(l,l:N)=c(l) *afw(l,l:N)  ; 

for  t=2:T 
for  i=l:N 
S=0; 

for  j=l:N 
%  summation  S: 

S=S+afwl(t-l, j)*a(j,i) ; 
end 

afw2(t,i)=S*b(i,0(t) ) ; 


end 

su=sum(afw2  (t,  : )  )  ; 
if  su==0 
su=eps ; 

end 

c (t) =l/su; 

afwl ( t , : ) =c ( t ) *afw2 ( t , : ) ; 
end 

bbwl=zeros (T,N) ; 
bbwKT,  :  )=c(T)  ; 
for  t=T-l:-l:l 
for  i=l:N 
S=0; 
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for  j=l:N 

S=S+a(i, j)*b(j,0(t+l) )*bbwl(t+l, j) 

end 

bbwl (t, i) =S*c (t) ; 
end 

end 
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APPENDIX  A7.  MULTIPLE  OBSERVATION  SEQUENCE 


%  Filename :  newmodel  .m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Builds  the  0_multi  matrix,  which  contains  all  trials 
obsrvations  and 
%  calls  trainmulti.m  function 
%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  nimber  of  states 

%  M:  total  number  of  possible  observation  symbols  (dimension  of 
codebook) 


%  models  re-estimation 
%  initial  conditions: 


0_multi= [classmicro; classmicro2 ; classmicroS ] ; 
%0_multi= [classsta; classsta2 ; classstaS ] ; 
%0_multi= [classinstr ; classinstr2 ; classinstr3 ] ; 
%0_multi= [classprof ; classprof 2 ; classprof 3 ] ; 

[a,b,pi]  =  trainmulti  (0_multi,N,T,M)  ; 
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APPENDIX  A8.  BAUM-WELCH  ALGORITHM  (MODEL  REESTIMATION) 


%  Filename:  trainmodel.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  this  function  implements  the  Baum-Welch  reetimation 
algorithm  as  discribed 
%  in  Rabiner  (1986)  pg  11 

%  0:  the  observation  sequence 

%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  number  of  states 

%  M:  total  number  of  possible  observation  symbols  (dimension  of 

codebook) 

%  afw:  the  forward  variable  (N  x  T) 

%  bbw  the  backward  variable  (N  x  T) 

%  afwl:  Scaled  forward  variable  (N  x  T) 

%  bbwl:  Scaled  backward  variable  (N  x  T) 

function  [a,b,pi,P]  =  trainmodel (0/N,T,M) 

%  if  the  model  is  ergotic  keep  a=rand(N) .  If  left-right  then 
a=triu(a),  because  it  has 

%  to  be  upper  triangular,  in  order  to  converge  to  an  upper  triangular 
matrix 

%  Initial  estimation  of  model  lamda= (a,b,pi) : 

%a=rand(N)  ; 
b=rand(N,M)  ; 
a=triu(a)  ; 

%  we  need  the  sximmation  or-  all  rows  of  a  and  b  to  be  1  ,  so: 
b  =  b. / ( (sum(b. ' ) ) • ' *ones (1,M) ) ; 
a  =  a. / ( (sum(a. ' ) ) . ' *ones (1,N) ) ; 
for  j=l:N; 

pi(j)=l/(50*j)  ; 
end 

%  we  need  the  summation  of  pi  to  be  1: 
pi=pi. /sum(pi) ; 
for  times=l:20 

afw=forward(a,b,pi,0,N,T)  ; 
bbw=backward(a,b,pi,0,N,T)  ; 

%  Scaling:  (Rabiner  89  pg272) 

%  afw  scaling: 
a_old=a; 
b_old=b; 
pi_old=pi ; 


afw2=zeros (T,N) ; 
af w2 ( 1 , : ) =af w ( 1 , : ) ; 

c  (1) =l/sum(afw(l, :)); 
afwl(l,l:N)=c(l) *afw(l,l:N) ; 
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for  t=2:T 
for  i=l:N 
S=0; 

for  j=l:N 
%  siommation  S: 

S=S+afwl ( t-1 / j ) *a ( j , i ) ; 
end 

afw2 { t , i ) =S*b(i , 0 ( t ) ) ; 


end 

su=suin(afw2  (t,  ; )  )  ; 

%  to  avoid  'divided  by  zero'  errors: 
if  su==0 
su=eps ; 

end 

c (t) =l/su; 

af wl { t , : ) =c ( t ) *af w2 ( t , : ) ; 
end 

%  bbw  scaling: 
bbwl=2eros (T,N) ; 
bbwKT,  :  )=c(T)  ; 
for  t=T-l:-l:l 
for  i=l:N 
S=0; 

for  j=l:N 

S=S+a ( i / j ) *b ( j , 0 ( t+1 ) ) *bbwl ( t+1 , j ) ; 

end 

bbwl (t , i) =S*c ( t) ; 
end 

end 

%  a  reestimation 
for  i=l:N 
S2=:[]; 
for  t=l:T-l 
Sl=[]; 
for  j=l:N 

Sl= [SI;  afwl ( t , i) *a (i , j ) *bbwl (t+1, j ) *b( j , 0(t+l) ) ] 

end 

S2=[S2  sum(Sl) ] ; 
end 

dena  ( i )  =s\m  ( sum  ( S2 )  )  ; 
end 


for  i=l:N 
for  j=l:N 

S1=0;S2=0; 
for  t=l:T-l 

Sl=  [SI;  afwKt,  i)  *a  (i,  j  )  *bbwl  (t+1,  j  )  *b(  j  ,0(t+l)  )  ]  ; 
end 

nxama  ( i ,  j  )  =sum  ( SI )  ; 
if  dena(i)==0 
dena(i) =eps; 

end 

a ( i , j ) =numa ( i , j ) /dena ( i ) ; 
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end 

end 

if  norm{a-a_old) < . 01 

a=a_old; 

end 

%  b  reestimation 
for  j=l:N 
for  k=l:M 
S1=0;S2=0; 
for  t=l:T 

Sl=Sl+afwl (t , j ) *bbwl (t , j ) * (0 (t) ==k) ; ; 

S2=S2+afwl (t, j ) *bbwl (t , j ) ; 
if  S2==0 
S2=eps; 

end 

b(j,k)=Sl/S2; 

end 

end 

end 

if  norm(b-b_old)  < .  01 

b=b_old; 

end 

%  P_bw  ( O I  lamda )  pr obabi  1  i ty 
logP=“Siam(loglO(c)); 

P=10'"logP; 

pi  ( : )  =  (afwl  (1,  : )  .  *bbw(l,  : )  /c  (1)  )  /P; 

%logpi=(loglO(afwl(l,i) )+loglO(bbw(l,i) )-logl0(c(l) ) )-logP; 
%pi  (i)  =10'"logpi; 
if  norm(pi-pi_old) < . 01 


pi=pi_old; 

end 

%fprintf('%g  %g  %g  \n' ,  norm  (a-a_old)  ,  norm  (b-b_old)  ,  norm  (pi -pi_old) ) 
if  norm(  (a-a_old)  ==0)  &  (norm(b-b_old)  ==0)  &  (norm(pi-pi_old)  ==0) 
break 
end 
end 
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APPENDIX  A9.  BADM-WELCH  ALGORITHM  (MODEL  REESTIMATION) 
WITH  MULTIPLE  OBSERVATIONS 


%  Filename:  trainmulti.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  this  function  (re) estimates  the  model  {a,b,lamda)  for 
multiple  Observation 

%  sequence  according  to  Rabiner  (89)  pg  273 
%  Each  row  of  0__multi  is  each  Observation  0,  where: 

%  O:  the  observation  sequence 

%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  number  of  states 

%  M:  total  number  of  possible  observation  symbols  (dimension  of 

codebook) 

%  afw:  the  forward  variable  (N  x  T) 

%  bbw  the  backward  variable  (N  x  T) 

function  [a,b,pi]  =  trainmulti (0_multi,N, T,M) 

%  s=size (0_multi) ; 
trials=s (1) ; 

afwl_multi=  [  ]  ;  bbwl__multi=  [  ]  ; 

%  Computing  P(0,lamda)  &  scaled  fw  and  bw  probabilities  for  each  Obs 
(trial) 

for  k=l: trials 

[a,b,pi,P]  =  trainmodel  (0__multi  (k,  : )  ,N,T,M)  ; 

P_multi (k) =P; 

[afwl,bbwl, c] =scale (a,b,pi, 0_multi (k, : ) ) ; 
afwl_multi= [afwl_multi  afwl] ; 
bbwl__multi=  [bbwl_multi  bbwl]  ; 

end 

%  Model's  re-estimation  using  an  alternative  formula: 

%P_multi(l)=l; 

%a_old=a; 

%  for  i=l:N 
%  for  j=l:N 
%  Slk= [ ] ; S2k= [  ]  ; 

%  for  k=l: trials 

%  S1=[];S2=[]; 

%  for  t=l:T-l 

%  S1=[S1  afwl_multi (t , i+ (k- 

1)  *N)  *a__old(i,  j  )  *b(  j  ,  0_multi  (k,  t+1)  )  *bbwl__multi  (t+1,  j+  (k-1)  *N)  ]  ; 

%  S2= [ S2  afwl^mul ti ( t , i+ ( k-1 ) *N) *bbwl_multi ( t , i+ (k-l ) *N) ) ; 

%  end 

%  Sl=sum(Sl) /P_multi (k) ; 

%  S2=sum(S2) /P_multi (k) ; 

%  Slk=[SlkSl]; 

%  S2k=[S2k  S2]  ; 

%  end 

%  Slk=sum ( Slk)  ;  S2k=sum ( S2k)  ; 

%  a(i, j)=Slk/S2k; 

%  end 
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%end 


%  a  denominator  evaluation: 
for  i=l:N 
Sk=[]; 

for  ]c=l:  trials 
S2=[]; 
for  t=l:T-l 
Sl=[]; 
for  j=l:N 

S1=[S1;  afwl_multi(t,i+{k-l) *N) *a (i, j ) *bbwl_multi (t+1, j+ (k- 
1)*N)  *b(j,0_multi{k,t+l) ) ] ; 
end 

S2=[S2  sum(Sl) ] ; 
end 

Sk=[Sk  S2/P_multi(k) ] ; 
end 

dena ( i ) =sum ( Sk) ; 
end 

%  a  final  Re-estimation 
for  i=l:N 
for  j=l:N 
Sk=[]; 

for  k=l: trials 
Sl=[]; 
for  t=l:T-l 

Sl= [SI  afwl_multi (t , i+ (k-1) *N) *a (i, j ) *bbwl_multi (t+l, j+ (k- 
1)*N) *b(j,0_multi(k,t+l) ) ] ; 

end 

Sk=[Sk  sum(Sl)/P_multi(k)]; 


end 

a(i,  j  )  =siam(Sk)  /dena{i)  ; 
end 

end 

%  b  reestimation: 
for  j=l:N 

for  1=1  :M 

Slk=[];S2k=[]; 
for  k=l: trials 
S1=[];S2=[]; 
for  t=l:T 

Sl= [SI  afwl_multi (t, j+ (k-1) *N) *bbwl_multi { t , j+ (k- 
1)  *N)  *  (0__multi  (k,  t)  ==1)  ]  ; 

.  S2=[S2  afwl_multi (t, j+ (k-1) *N) *bbwl_multi (t, j+ (k-1) *N) ] 


end 

Sl=sum(Sl) /P_multi(k) ; S2=sum(S2 ) /P_multi (k) ; 

Slk=[Slk  Sl];S2k=[S2k  S2] ; 

end 

Slk=sum(Slk)  ;  S2k=s\im(S2k)  ; 
b(j,l)=Slk/S2k; 
end 

end 


87 


APPENDIX  AlO.  VITERBI  ALGORITHM 


%  Filename:  viterbi.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  function  that  implements  the  Viterbi  algorithm  as  discribed 
in 

%  Rabiner  (1986)  pg  11 

%  0:  the  observation  sequence  of  testing  word 

%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  number  of  states 

%  M:  total  number  of  possible  observation  symbols  (dimension  of 

codebook) 

%  afw:  the  forward  variable  (N  x  T) 

%  bbw  the  backward  variable  (N  x  T) 


function  [f i,mi, s_seq] =viterbi (a,b,pi, 0,N,T) 
%  Viterbi  algorithm: 

%  initial  value  f il (i) =pi (i) *bi (01) 
fi=zeros (T,N) ; 

fi(l,l:N)=pi(l:N) .*b(l:N,0(l)) 

%  backtracking  pointer  mi : 
mi=zeros (1,N) ; 
for  t=2:T 
for  j=l:N 
M=  [  ]  ; 
for  i=l:N 

M=[M  fi (t-1, i) *a(i, j ) ] ; 
end 

ArgMax=find(M==max(M)  )  ; 
fi  (t,  j  )  =max(M)  .*b(j,0(t)); 
mi ( t , j ) =ArgMax ( 1 ) ; 
end 

end 

%  Path  (state  sequence)  backtracking: 
s__seq=zeros  (1,T)  ; 

%  final  state  @  T  time: 
u=find(fi (T, : ) ==max(fi (T, : ) ) ) ; 
s_seq{T) =u(l) ; 
for  t=T-l:-l:l 

s_seq(t) =mi (t+1, s_seq(t+l) )  ; 

end 
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APPENDIX  All.  SCORING  (CLASSIFICATION) 


%  Filename:  score. m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Evaluates  P__bw(0 1  lamda)  and  P_viterbi  (0|  lamda) 


% 

0: 

the  obseirvation  sequence  of  tested  word 

% 

T: 

total  number  of  segments 

% 

a: 

the  state  transition  probabilities  (N  x 

N) 

% 

b: 

observation  probability  distribution 

(N 

X  M) 

% 

pi: 

initial  state  distribution  (N  x  1) 

% 

N: 

total  number  of  states 

% 

M: 

total  number  of  possible  observation 

symbols  (dimension  of 

codebook) 

% 

afw: 

the  forward  variable  (N  x  T) 

% 

bbw 

the  backward  variable  (N  x  T) 

function  [P_bw, P_v] =score (a,b,pi, 0) 

%  Scoring  of  an  Observation 
N=length(a) ; 
s=size (b) ; 

T=length(0) ; 

%  Viterbi  probability: 

[f i,mi, s_seg]  =viterbi (a,b,pi, 0,N,T)  ; 
P_v=max{fi  (T,  : )  )  ; 

%  Baum-Welch  probability: 
afw=forward(a,b,pi,0/N,T) ; 
bbw=backward(a,b,pi,0,N,T)  ; 

[ af wl , bbwl ,c]=scale(a,b,pi,0)  ; 

%  Rabiner  89  pg  273  P (o, lamda) =l/prod(c) 
sc=-sum(loglO (c) ) ; 

P_bw=l/prod(c) ; 
if  P_bw==NaN 
P-_bw=0; 

end 
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APPENDIX  A12.  CLASSIFICATION  RESULTS  PLOTS 


%  Filename:  scoretest.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Plots  classification  results  for  all  words,  evaluates 
P__bw  ( 0 1  lamda )  and  P_vi terbi  ( 0 1  lamda ) 

%  for  every  model 

%  0:  the  observation  sequence  of  tested  word 

%  T:  total  number  of  segments 

%  a:  the  state  transition  probabilities  (N  x  N) 

%  b:  observation  probability  distribution  (N  x  M) 

%  pi:  initial  state  distribution  (N  x  1) 

%  N:  total  number  of  states 

%  M:  total  ntimber  of  possible  observation  symbols  (dimension  of 

codebook) 

%  afw:  the  forward  variable  (N  x  T) 

%  bbw  the  backward  variable  (N  x  T) 

0=classstatest ; 

' test :  statistics ' 

[P_bwl, P_vl] =score (a,b,pi, 0) 
if  (P_bwl==0) 

P_bwl=eps ; 

end 

if  (P_vl==0) 

P_vl=eps; 

end 

O=classmicrotest ; 

'test:  microsoft' 

[P__bw2,P_v2]  =  score  (a,b,pi,0) 
if  {P_bw2==0) 

P_bw2=eps; 

end 

if  (P_v2==0) 

P_v2=eps; 

end 

0=classinstrtest ; 

' test :  instructor ' 

[P_bw3,P_v3] =score(a,b,pi,0) 
if  (P_bw3==0) 

.P_bw3=eps; 

end 

if  (P_v3==0) 
p_v3=eps; 

end 


O=classprof test ; 

'test:  professor' 

[P_bw4,  P_jv4]  =score  (a,b,pi,0) 
if  (P_bw4==0) 

P_bw4=eps ; 

end 
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if  {P_v4==0) 

P— v4=eps; 

end 

subplot (2,1,1) ,  barh(10*logl0 ( [P_bwl  P_bw2  P_bw3  P_bw4])); 
title ([ 'Baum-Welch  Probability  -  1 : Statistics,  2:Microsoft, 
3:Instructor  4:Professor,  M='  nuin2str(M)  N='  num2str (N)  ] )  , 
xlabel { ' P_B_W[dB] ' ) ,  ylabel ( ' tested  word' ) 
grid 

subplot  (2, 1, 2)  ,  barh{10*logl0  ( [P_vl  P_v2  P__v3  P_y4])); 

title ([ 'Viterbi  Probability  -  l:Statistics,  2:Microsoft,  3:Instructor 

4; Professor,  M='  num2str(M)  ',  N='  nuin2str  (N)  ] )  , 

xlabel  ( '  P__V_i_t_e_r__b_i  [dB]  ' )  ,  ylabel  ( '  tested  word' ) 

grid 
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APPENDIX  A13,  VECTOR  QUANTIZATION  ALGORITHM 

function  [CODE,  label,  dist]  =  svq(X,  lev,  n_it) ; 

%svq  Vector  quantization  using  successive  binary  splitting  steps. 
%  Use:  [CODE, label, dist]  =  svq(X, lev,n_it) . 

%  The  final  codebook  dimension  lev  should  be  a  power  of  two.  dist 
%  returns  the  distorsion  values  at  the  end  of  intermediate  step. 

%  n_it  is  the  number  of  iterations  performed  in  each  step. 

%  Version  1.3 

%  Olivier  Cappi,  28/09/94  -  04/03/97 
%  ENST  Dpt.  Signal  /  CNRS  URA  820,  Paris 

%  Needed  functions 
if  (exist ('vq')  2) 

error ( 'Function  vq  is  missing. ' ) ; 
end; 

%  Turn  verbose  mode  off 
QUIET  =  1; 

%  Input  agruments 

error ( nargchk ( 3 ,  3 ,  nargin ) ) ; 

%  Dimension  of  imput  data 
[n,p]  =  size(X); 

%  Number  of  spliting  steps 
nbs  =  round (log2 (lev) ) ; 
lev  =  2'^nbs; 

%  Fixed  perturbation 
perturb  =  0.01; 

%  Initialize  first  centroid  with  global  mean 
CODE  =  zeros (lev,  p) ; 

CODE_  =  zeros (lev,  p); 

CODEd,:)  =  mean(X); 
label  =  ones(n,l); 

for  i=l:nbs 

%  1.  Codebook  splitting 
for  j=l:  (2'^{i-l)) 

CODE_(2*j-l, : )  =  (1+perturb)  *  CODE(j,:); 

CODE_(2*j,:)  =  (l-perturb)  *CODE(j,:); 

end; 

%  2.  K-means  optimization 

[CODE(l:2^i, :) ,label,vdist]  =  vq(X,CODE_(l :2^i, :)  ,n_it, QUIET) ; 
dist(i)  =  vdist(n_it); 

fprintfd,  'Codebook  size  %d: \t%  .  3f  \n' ,  2^i, dist  (i)  )  ; 
end; 
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APPENDIX  A14  VECTOR  QUANTIZATION  USING  LGB  ALGORITHM 


function  [CODE_n,  label,  dist]  =  vq(X,  CODE,  n_it,  QUIET) ; 

%vq  Vector  quantization  using  the  K-means  (or  LBG)  algorithm. 

%  Use:  [CODE_n, label, dist]  =  vq (X,CODE, n_it) 

%  Performs  n_it  iterations  of  the  K-means  algorithm  on  X,  using 
%  CODE  as  initial  codebook. 

%  Version  1.3 

%  Olivier  Cappi,  28/09/94  ~  16/07/97 
%  ENST  Dpt.  Signal  /  CNRS  URA  820,  Paris 

error ( nargchk ( 3 ,  4 ,  nargin ) ) ; 
if  (nargin  <  4) 

QUIET  =  0; 
end; 

%  Dimensions  of  X 
[n,p]  =  size(X); 

%  Codebook  size 
m  =  length (C0DE( :, 1) ) ; 

%  Initialialize  label  array 
label  =  zeros ( 1, n); 

%  As  well  as  distortion  values 
dist  =  zeros {l,n_it) ; 

%  Main  loop 
CODE_n  =  CODE; 
for  iter  =  l:n_it 

%  1.  Find  nearest  neighbor  for  the  squared  distortion 
DIST  =  zeros (m,n); 
if  (P  >  1) 
for  i  =  l:m 

DIST{i,:)  =  svm({(X  -  ones (n, 1) *C0DE_n{i, :))'). ^2) ; 
end; 
else 

%  Beware  of  siom  when  p  =  1  ( ! ) 

DIST  =  (ones  (m,  1)  *X'  -  CODE__n*ones  (l,n) )  ,  ""2  ; 
end; 

[vm, label]  =  min(DIST); 

%  Mean  distortion 
dist(iter)  =  mean(vm); 

%  2 .  Update  the  codebook 
n_out  =  0; 
for  i  =  l:m 
ind  =  (1 :n) ; 

ind  =  ind ({label  ==  i) ) ; 
if  (length (ind)  ==  0) 

%  Isolated  centroid  are  not  modified 
n_out  =  n_out  +  1; 
elseif  (length (ind)  ==  1) 

%  When  there  is  only  one  nearest  neighbor  for  a  given  codebook 

entry 

C0DE_n(i,:)  =X(ind,:); 
else 
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CODE__n  { i ,  : )  =  mean  (X  ( ind,  : )  )  ; 
end; 
end; 

%  Affichage 
if  ('-QUIET) 

%  fprintf (1, ' Iteration  %d: \t% . 3f \n' , iter, dist (iter) ) ; 
end; 

if  (n_out  >  0) 

%  fprintf (!,'  Warning  :  %.0f  isolated  centroids \n ' ,n_out) 
end; 
end; 
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APPENDIX  B.  SEISMO-ACOUSTIC  SONAR  PROJECT  INFORMATION 


This  appendix  contains  some  basic  information  regarding  the  NPS  seismo- 
acoustic  sonar  project.  Further  details  may  be  found  in  [10, 11, 12]. 
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Actuator  with  waterproof  case  and  coupling  device  [12] 


! 

i 
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APPENDIX  B2.  BEACH  TEST  SITE 


Beach  test  site  with  data  collection  equipment  [12] 
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APPENDIX  B3.  MINE-LIKE  OBJECTS 


Gas  cylinder  target  [12] 
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APPENDIX  B4.  BURIED  MINE-LIKE  OBJECTS 


Amplitude  Amplitude  Amplitude  Amplitude 


APPENDIX  B5.  8-CHANNEL  DATA  PLOT  [12] 


Channel  1 


Channel  2 


Range[fl] 


Rangefft] 


8-Channel  data  plot  of  the  received  signals.  Channels  1  and  2  come  from  the 
accelerator  meters.  Channels  3, 4, 5  represent  the  x,  y,  z  motion  of  the  1**  geophone, 
and  channels  6, 7, 8  the  x,  y,  z  motion  of  the  2“^  geophone. 
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APPENDIX  B6.  TEST  SETUP  FOR  HELIUM  GAS  TANK  AND  GUNPOWDER 
KEG  TARGET  TESTS  (INCREASING  MASS)  [12] 

This  appendix  contains  the  experimental  setup  for  target  detection  with  increasing 
mass  tests  conducted  on  November  6, 1998  and  November  10, 1998. 


Source  #1 

Orientation;  Vertical 
Voltage:  lOV  (20Vpk-pk) 

Frequency:  80Hz 

Cycles;  1 

Geophone  #1 
Filter:  High  Pass  40Hz 
Gain:  40db 


Source  #2 

Orientation:  Vertical 
Voltage:  lOV  (20Vpk-pk) 

Frequency:  80Hz 

Cycles:  1 

Geophone  #2 
Filter:  High  Pass  40Hz 
Gain:  40db 


Comments:  Overcast,  med-high  tide,  4-5ft  waves. 


APPENDIX  B7.  Rayleigh  wave  [11] 


Seismic  wavetrain  resulting  from  a  single  vertical  impulse  source  [11] . 
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APPENDIX  B8.  GEOPHONE  [11] 


APPENDIX  B9.  CROSS  POWER  [12] 


Geo  #1  Real  Power  ~>  tgt4lbsum  Geo  HZ  Real  Power 


Real  and  imaginary  cross  power  components  of  the  received  signal  for  both 
geophones.  Target  located  at  22  feet. 


I 
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APPENDIX  BIO.  TARGETS  STRENTH 


Target  Strength  Vs  Target  Mass 


Target  strength  vs.  target  mass  for  powder  keg  target  [12]. 
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APPENDIX  BIO.  IMAGINARY  CROSS  POWER  SIGNAL  FOR  THE  CYLINDER 

TARGET 


Imaginary  power  plot  for  cylinder,  November  6*^  experiment,  5  weight  types  [12]. 
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APPENDIX  Bll.  IMAGINARY  CROSS  POWER  SIGNAL  FOR  THE  POWDEP 

KEG  TARGET 
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Imaginary  power  plot  for  powder  keg,  November  10*  experiment,  six  data  plots. 
Note  that  the  2“'^  geophone  did  not  receive  the  target  [12]. 
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APPENDIX  C.  MATLAB  CODE;  HMM  BASED  CLASSIFIER  FOR  MINE 

RECOGNITION 


This  Appendix  contains  the  various  MATLAB  files  used  for  recognizing  the 
mine-like  objects  using  HMMs.  MATLAB  files  written  in  [11]  used  for  preprocessing 
the  data  are  also  included  for  completeness. 
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APPENDIX  Cl.  COMPUTATION  OF  THE  CROSS  POWER  SIGNAL  FROM 
RAW  SIGNAL;  8  CHANNEL;  SIGNAL.  CODE  WRITTEN  BY  M. 

FITZPATRICK  [12] 


%  Name: 

%  Author: 

%  Updated: 

%  Description: 


Presents .m 

LT  Mike  Fitzpatrick 

9/31/98 

This  program  conducts  a  Hilbert  analysis.... 


%%%Input  Parameters%%% 
Range=l ; 
wavespd=295; 
t_start=0.050; 
t_stop=0 .150; 
r_start=18; 
r_stop=28 ; 
scale=0 . 0230; 
georange=10; 


%Turns-on  plotting  with  range  axis 
%Wavespeed  [ft/s] 

%Set  start  time  [s] 

%Set  stop  time  [s] 

%Set  start  range  [ft] 

%Set  stop  range  [ft] 

%Set  axis  scaling  (Set  to  0  turns-off  scaling) 
%Enter  range  to  geophone  [ft] 


%%%Date,  Directory,  &  File%%% 
date='Nov_6'  ; 
directory=' Targets ' ; 
cd  (date),cd  (directory) 


COUNT=0;  %Set  start  count 

while (1) 

clc,disp( ' ***Hilbert  Analysis  Subroutine*** '), dir  *.mat; 
COUNT=COUNT+l 

if  COUNT==l, load  tgtOlbsum,  end 
if  COUNT==2 , load  tgt41bsum,  end 
if  COUNT==3 , load  tgt81bsum,  end 
if  COUNT==4,  load  tgtl21bs\im,  end 
if  COUNT==5,  load  tgtl61bs\am,  end 
if  COUNT==6,break,end 
[M,N) =size (channel) ; 
transform=hilbert (channel) ; 

Pwr( : , 1, COUNT) =conj (transform( : ,3) ) .*transform( : ,5) ; 

Pwr  ( :  ,2, COUNT)  =conj  (transfo2rm( : ,  6)  )  .  *transfo2rm( : ,  8)  ; 

end 

end 


%%%Set  Axes%%% 
if  Range==l 

[maxi , indexl ] =max ( abs ( channel (:,!))); 

[max2 , index2 ] =max ( abs ( channel ( : , 2 ) ) )  ; 
index=round( ( indexl +index2) /2) ; 

start=round( ( ( indexl +index2) 12)  + (r_start/ (dt*wavespd) ) ) ; 
stop=round( ( ( indexl +index2) /2) + (r_stop/ (dt*wavespd) ) ) ; 
range=wavespd* ( t (start : stop) -t (index) ) ; 
%start=round(delay+ (restart/ (dt*wavespd) ) ) ; 
%stop=round(delay+  (r_stop/  (dt*wavespd)  )  )  ; 

% range =wavespd* ( t (start : stop) -t (delay) ) ; 
else 

start=round( (t_start/dt) +1) ; 
stop=round( ( t_stop/dt) +1) ; 


110 


end 


if  scale==0 

Realinax=l.l*max(max(abs  (real  (Pwr  (start  :stop,  :))))); 
Imaginax=l .  1  *max  (max  ( abs  ( imag  ( Pwr  ( start :  stop ,:))))); 
else 

Realmax=  scale;  Imagmax= scale; 

end 

for  n=l: COUNT- 1 

Pwrl{ ; ,n)=imag(Pwr (start:stop,l,n) ) ; 

Pwr2  ( : , n)  =imag( Pwr  (start :  stop,  2, n)  )  ; 

end 

%%%Plotting%%% 
figure, orient  portrait 
if  Range==l 

subplot (2, 1,1) 

plot (range, Pwrl ( : , 1) , 'c' , 'LineWidth' ,1) ,hold 
plot (range, Pwrl ( : , 2) , 'm' , 'LineWidth' ,2) 
plot (range, Pwrl ( : ,3) , 'g' , 'LineWidth' ,3) 
plot (range, Pwrl ( : , 4) , 'b' , 'LineWidth' ,4) 
plot (range, Pwrl ( : , 5) , 'r ' , 'LineWidth' , 5) 
axis  (  [min  ( range )  max  ( range )  -Imagmax  Imagmax] )  ,  hold 
title ( 'Vector  Polarization  Filter  (Geophone  #1)') 
xlabel ( 'Range  [ft] ') ,ylabel ( 'Amplitude' ) ,grid 
legend { '1561bs' , '1601bs', '3641bs', '4681bs', '5721bs') 
%%%%%%%%%%%%% 
subplot (2, 1, 2) 

plot (range, Pwr2 ( : , 1) , 'c' , 'LineWidth' , 1) ,hold 

plot (range, Pwr2 ( : ,2) , 'm' , 'LineWidth' ,2) 

plot (range, Pwr2 ( : ,3) , 'g' , 'LineWidth' ,3) 

plot (range, Pwr2 ( : , 4) , 'b' , 'LineWidth' ,4) 

plot (range, Pwr2 ( : , 5) , 'r ' , 'LineWidth' , 5) 

axis  ( [min  (range)  max  (range)  -Imagmax  Imagmax]  ),  hold 

title ( 'Vector  Polarization  Filter  (Geophone  #2)') 

xlabel (' Range  [ft] ') , y label ( 'Amplitude' ) ,grid 

legend (' 15 61bs', '1601bs', '3641bs', '4681bs', '5721bs') 

end 

cd  .  • 
cd  .  . 
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APPENDIX  C2.  DATA  SELECTION  USED  FOR  THE  MULTIPLE  WEIGHT 
SIGNAL  EXPERIMENT  SET-UP.  DATA  USED  FOR  HMM  TRAINING 


%  Filename :  kegcyl .m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Generates  and  plots  the  signals  for  the  multiple  weight  HMMs 
training 

%  kegmat.mat:  imaginary  cross  power  of  the  keg  signal 

%  cylmat.mat:  imaginary  cross  power  of  the  cylinder  signal 

%  back. mat:  imaginary  cross  power  of  a  non  target  signal 

load  kegmat 
load  cylmat 
load  back; 

range=linspace (1, 60,3000) ; 

%  interpolation  to  3000  points  of  all  vectors,  in  order  all  signals  to 
have  the  same 
%  length 

keg4321b(2277, : ) = [ ] ;keg5361b (2277 , : )=[] ; cyl5721b{2277 , :)=[]; 
sk=size (keg2241b) ; sc=size (cyl3641b) ; sb=size (back) ; 
kegint=zeros  (3000, 6)  ;cyllint=zeros  (3000, 6)  ;backint:=zeros  (3000, 6)  ; 

%  6  trials  per  signal: 
for  n=l:6 

keg2241bint ( : , n) = (interpl (1 : sk(l) , keg2241b( : ,n) , 1 : sk(l) /3001 : sk(l) ) ) ' ; 
keg4321bint ( : ,n) = (interpl (1 : sk(l) , keg4321b( : ,n) , 1 : sk(l) /3001 : sk(l) ) ) ' ; 
keg5361bint ( : ,n) = (interpl (1 : sk(l) , keg5361b( : ,n) , 1 :sk(l) /3001 : sk(l) ) ) ' ; 


cyl3641bint ( : , n) = (interpl (1 : sc (1) , cyl3641b ( : ,n) , 1 : sc (1) /3001 : sc (1) ) ) ' ; 

cyl4681bint ( : , n) = (interpl (1 : sc (1) , cyl4681b( : ,n) , 1 : sc (1) /3001 : sc (1) ) ) ' ; 

cyl5721bint ( : ,n) = (interpl (1 : sc (1) , cyl5721b{ : ,n) , 1 : sc (1) /3001 : sc (1) ) ) ' ; 

backint ( : ,n) = (interpl (1 : sb(l) ,back( : ,n) , 1 : sb (1) /3001 : sb(l) ) ) ' ; 
end 

%  plot  of  all  signals 
subplot (3,2,1) 

plot (range, keg2 24 Ibint ( : ,1) , 'c' , 'LineWidth' ,1) ,hold 
plot (range, keg2 2 4 Ibint ( : , 2 ) , 'm' , 'LineWidth' , 1) 
plot (range, keg2 2 4 Ibint ( : , 3) , 'g' , 'LineWidth' ,1) 
plot (range, keg2 2 4 Ibint ( : , 4) , 'b' , 'LineWidth' ,1) 
plot (range, keg2 2 4 Ibint ( : , 5) , 'k' , 'LineWidth' , 1) 
plot (range, keg2 2 4 Ibint ( : , 6) , ' r ' , 'LineWidth' , 1) 
hold 

title ( 'Powder  Keg  2241b  ') 

axis([0  60  -0.03  0.03]),grid; 
subplot (3,2,3) 
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plot (range, keg4321bint { : , 1) , 'c' , 'LineWidth' ,1) ,hold 
plot (range, keg4321bint ( : , 2 ) , 'm' , 'LineWidth' , 1) 
plot (range, keg4321bint ( : , 3) , 'g' , 'LineWidth' ,1) 
plot (range,keg4321bint ( : , 4) , 'b' , 'LineWidth' ,1) 
plot (range, keg4321bint ( : , 5) , 'k' , 'LineWidth' ,1) 
plot (range, keg4 32 Ibint ( : , 6) , 'r ' , 'LineWidth' ,1) 
axis([0  60  -0.03  0 . 03] ) ,grid; 
title( 'Powder  Keg  4321b  ') 

hold 

subplot (3,2,5) 

plot (range, kegS 3 6 Ibint ( : , 1) , 'c ' , 'LineWidth' ,1) ,hold 
plot (range, kegS 3 6 Ibint ( : , 2 ) , 'm' , 'LineWidth' ,1) 
plot (range, keg5 3 6 Ibint ( : , 3) , 'g' , 'LineWidth' ,1) 
plot (range, kegS 3 6 Ibint ( : , 4) , 'b' , 'LineWidth' ,1) 
plot (range, kegS 3 6 Ibint ( : , 5) , 'k' , 'LineWidth' ,1) 
plot (range, kegS 3 6 Ibint ( : , 6) , 'r ' , 'LineWidth' ,1) 
axis([0  60  -0.03  0. 03]), grid; 
title( 'Powder  Keg  5361b  ') 

hold 

subplot (3, 2,2) 

plot ( range , cyl3  641bint ( : , 1 ) , ' c ' , ' LineWidth ' , 1 ) , hold 
plot (range, cyl3 64 Ibint ( : , 2) , 'm' , 'LineWidth' ,1) 
plot (range, cyl3 64 Ibint ( : , 3 ) , 'g' , 'LineWidth' , 1) 
plot (range, cyl3641bint ( : ,4) , 'b' , 'LineWidth' ,1) 
plot (range, cyl3 64 Ibint ( : , 5) , 'k' , 'LineWidth' ,1) 
plot (range, cyl3 64 Ibint ( : , 6) , 'r ' , 'LineWidth' ,1) 
axis([0  60  -0.02  0 . 02] ) , grid; 

title ( 'Cylinder  3641b  ') 
hold 

subplot (3,2,4) 

plot (range, cyl4 6 8 Ibint ( : , 1) , 'c' , 'LineWidth' ,1) ,hold 
plot (range, cyl4 6 8 Ibint ( : , 2 ) , 'm' , 'LineWidth' , 1) 
plot (range, cy 14 6 8 Ibint ( : , 3) , 'g' , 'LineWidth' ,1) 
plot (range, cyl4681bint ( : , 4) , 'b' , 'LineWidth' , 1) 
plot (range, cyl4 6 8 Ibint ( : , 5) , 'k' , 'LineWidth' , 1) 
plot (range, cy 14 6 8 Ibint ( : , 6) , 'r ' , 'LineWidth' , 1) 
axis([0  60  -0.02  0.02]),grid; 

title ( 'Cylinder  4681b  ') 
hold 

subplot (3,2,6) 

plot  (range, cy 15 72 Ibint  ( :  ,1)  ,  'c' ,  'LineWidth'  ,1)  ,hold 
plot (range, cylS 72 Ibint ( : , 2) , 'm' , 'LineWidth' ,1) 
plot (range, cylS 72 Ibint ( : , 3) , 'g' , 'LineWidth' ,1) 
plot (range, cyl 5 72 Ibint ( : ,4) , 'b' , 'LineWidth' ,1) 
plot (range, cylS 7 2 Ibint ( : , 5) , 'k' , 'LineWidth' , 1) 
plot (range, cylS 72 Ibint ( : , 6) , 'r ' , 'LineWidth' , 1) 
axis([0  60  -0.02  0. 02]), grid; 

title ( 'Cylinder  5721b  ') 
hold 

%  target  location: 
for  n=l : 6 

ik224(n)=find 

(keg2241bint (1000 ; 1800,n) ==inax{keg2241bint (1000 : 1800, n) ) ) 
ik432 (n) =f ind 

(keg4321bint (1000 : 1800,n) ==max (keg4 32 Ibint (1000 : 1800,n) ) ) 
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ik536 (n) =f ind 

(keg5361bint (1000: 1800, n) ==max(keg5361bint (1000:1800, n) ) ) ; 
ik224(n)=ik224(n) +1000-100; 
ik432 (n) =ik432 (n) +1000-100; 
ik536 (n) =ik536 (n) +1000-100; 

end 

ik224=ik224(l) ; ik432=ik432 (1) ; ik536=ik536 (1) ; 

for  n=l:6 

ic364 (n) =find 

(cyl3641bint  (1000:1800, n)  ==inax (cyl3641bint  (1000 : 1800 ,n)  )  )  ; 
ic364(n) =ic364 (n) +1000-100 ; 
ic468 (n) ind 

(cyl4681bint ( 1000 : 1800, n)^=max(cyl4681bint( 1000: 1800, n) ) ) ; 
ic468(n)=ic468 (n) +1000-100 ; 

ic572 (n) =f ind 

(cyl5721bint (1000:1800, n) ==max (cyl5721bint (1000:1800, n) ) ) ; 
ic572 (n)=ic572 (n) +1000-100; 


end 

ic364=ic364(l) ; ic468=ic468 (1) ;ic572=ic572 (6) ; 

%  so  now  we  know  where  the  target  is  located,  we  can  build  the  signal 
file 

%  signal (data, mine, trial) 

%  mine(l-3)  is  for  the  keg  (6,8,10  ft)'  s  target,  mine(4)  is  for  the 
cylinder'  s  target 

%  and  mine (5-10)  background  and  other  non  target  data 
signalmulti=zeros (400,10,6)  ; 
for  trials=l:6 

signalmulti ( : , 1 , trials) =keg2241bint (ik224 : ik224+399 , trials) ; 
signalmulti ( : , 2 , trials) =keg4321bint (ik432 : ik432+399, trials) ; 
signalmulti ( : , 3 , trials) =keg5361bint (ik536 : ik536+399 , trials) ; 
signalmulti ( : , 4, trials) =cyl3641bint (ic364 : ic3 64+399 , trials) ; 

signalmulti ( : , 5, trials) =cyl4681bint (ic468 : ic468+399, trials) ; 
signalmulti ( : , 6 , trials) =cyl5721bint (ic572 : ic572+399 , trials) ; 

signalmulti ( : , 7, trials) =backint (ic364 : ic364+399, trials) ; 
signalmulti ( : , 8, trials) =backint (ik536 : ik536+399, trials) ; 
signalmulti ( : , 9 , trials) =backint (2001 : 2400 , trials) ; 
signalmulti ( : , 10 , trials) =backint (1500:1899, trials) ; 
signalmulti ( : , 11, trials) =backint (800 : 800+399, trials) ; 
signalmulti ( : , 12 , trials) =backint (2500:2899, trials) ; 


end 

%signalmulti ( : , 3 , : ) =signalmulti ( : , 3 , : ) /2 ; 
for  m=l:4 
for  n=l : 6 

signalmulti  ( :  ,m,n)  =signalmulti  ( :  ,m,n)  -mean (signalmulti  ( :  ,m,n)  )  ; 

end 

end 

cd  .  . 

save  signalmulti  signalmulti 
cd  data 
figure (2) 
range=l:40*0; 

%range=linspace (21, 29 , 400) ; 
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subplot (3,2,1) 

plot (range, signalmulti (:,l,l),'c', 'LineWidth' ,1) ,hold 
plot (range, signalmulti {:,l,2),'m', 'LineWidth' , 1) 
plot (range, signalmulti( : , 1, 3) , 'g' , 'LineWidth' , 1) 
plot (range, signalmulti (;,l,4),'b', 'LineWidth' , 1) 
plot (range, signalmulti (:,l,5),'k', 'LineWidth' , 1) 
plot (range, signalmulti (:,l,6),'r', 'LineWidth' , 1) 
title (' Powder  Keg  2241b  ') 

hold 

subplot (3,2,3) 

plot (range, signalmulti (:,2,l),'c', 'LineWidth' , 1) , hold 
plot (range, signalmulti (:,2,2),'m', 'LineWidth' , 1) 
plot (range, signalmulti (:,2,3),'g', 'LineWidth' ,1) 
plot (range, signalmulti (:,2,4),'b', 'LineWidth' , 1) 
plot (range, signalmulti (;,2,5),'k', 'LineWidth' , 1) 
plot (range, signalmulti (:,2,6),'r', 'LineWidth' , 1) 
title ( 'Powder  Keg  4321b  ') 
hold 

subplot (3, 2, 5) 

plot (range, signalmulti (:,3,1) , 'c', 'LineWidth' , 1) ,hold 
plot (range, signalmulti ( : , 3 , 2) , 'm' , 'LineWidth' , 1) 
plot (range, signalmulti (:,3,3),'g', 'LineWidth' , 1) 
plot (range, signalmulti (:,3,4),'b', 'LineWidth' , 1) 
plot (range, signalmulti (:,3,5),'k',' LineWidth' , 1) 
plot (range, signalmulti (:,3,6),'r', 'LineWidth' , 1) 
title ( 'Powder  Keg  5361b  ') 

%axis([0  400  “0.02  0.02]) 
xlabel  ( ' Data  Points ' ) 
hold 

subplot (3,2,2) 

plot (range, signalmulti (:,4,1), 'c', 'LineWidth' , 1) , hold 
plot (range, signalmulti ( : , 4, 2 ) , 'm' , 'LineWidth' , 1) 
plot (range, signalmulti (:,4,3), 'g', 'LineWidth' , 1) 
plot (range, signalmulti (:,4,4), 'b', 'LineWidth' , 1) 
plot (range, signalmulti (:,4,5),'k', 'LineWidth' , 1) 
plot (range, signalmulti ( ; , 4, 6) , 'r' , 'LineWidth' , 1) 
title ( 'Cylinder  3641b  ') 
hold 

subplot (3,2,4) 

plot (range, signalmulti { : , 5, 1) , 'c' , 'LineWidth' , 1) ,bold 
plot (range, signalmulti (:,5,2) , 'm', 'LineWidth' , 1) 
plot (range, signalmulti {:,5,3),'g', 'LineWidth' ,1) 
plot (range, signalmulti (:,5,4),'b', 'LineWidth' , 1) 
plot (range, signalmulti ( : , 5, 5) , 'k' , 'LineWidth' , 1) 
plot (range, signalmulti (:,5,6) , 'r', 'LineWidth' , 1) 

title ( 'Cylinder  4681b  ') 

hold 

subplot (3,2,6) 

plot (range , signalmulti (:,6,1) , 'c', ' LineWidth ' ,  1 )  ,  hold 
plot (range, signalmulti ( : , 6 , 2 ) , 'm' , 'LineWidth' , 1) 
plot (range, signalmulti (:,6,3),'g', 'LineWidth' , 1) 
plot (range , signalmulti (:,6,4) , 'b', ' LineWidth' , 1) 
plot (range, signalmulti (:,6,5),'k', 'LineWidth' , 1) 
plot (range, signalmulti ( : , 6, 6) , 'r' > 'LineWidth' , 1) 
title ( 'Cylinder  5721b  ') 
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xlabel  ('Data  Points') 

hold 

cd  .  . 

save  signal  signalmulti 
cd  dat 


APPENDIX  C3,  FEATURE  ANALYSIS  FOR  MINE  LIKE  OBJECT  SIGNALS; 
MULTIPLE  WEIGHTS  EXPERIMENT  SET-UP 


%  Filename :  trainingkegcyl . m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Features  Analysis  for  mine-like  object  signals  with 
multiple  weights 

%  (3  sets  of  weights  for  the  keg  and  the  cylinder  -  6  trials 

each,  and  7  background  signals) ,  for 

%  HMM  recognition  use.  We  will  create  two  classes,  one  for 

every  mine-like  object. 

%  Vector  quantization  using  k-means 

%  M:  #  of  symbols 
%  N:  #  of  states 
%  mines:  #  total  #  of  mines 

%  c:  #  coefficients  matrix  for  all  mines 

%  w:  #  VQ  coeff  matrix 


%  vector  quantization  with  multiple  lb  (keg, cylinder, background) 
clear 

load  signal 
signal=signalmulti ; 
clear  signalmulti; 

M=8;N=4; 

rand ( ' seed ' ,  1 )  ; 

n_it=20; 

%segs=4; 
mines=12 ; 


s=size{ signal) ;sl=s{l)  ; 


%seg=fix(sl/segs) ; 

%Coef ficients  evaluation 
%c=zeros (T, 4 , 8 , segs , 6 )  ; 
for  mine=l: mines 
for  trial=l : 6 

[c( : , : , mine, trial) , T] =coeffinterp (signal ( : , mine, trial) ) ; 

end 

end 


%  vector  quantization 
pl=[]; 
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mine, trial) ] ; 


for  mine=l: mines 
for  trial=l:6 
pl= [pi  ;c ( : , : , 
end 
end 

[CODE,  label,  dist]  =  svq(pl,  M,  n_it) ; 

class=zeros (T, mines, 5) ; 
w=zeros (T, 8, mines, 5) ; 
for  mine=l:mines 
for  trial=l:6 

[w  class ( : ,mine, trial) ,  dist] =vq(c ( : ,  : , mine,  trial) ,CODE,n_it) ; 
end 
end 

%  Test  of  vector  quantization 
%figure(l) 

%subplot(4,2,l) ;plot {l:8,c(l, : ,1,1) , , 1 : 8 , w (class { 1 , 1 , 1) ,:,!,!)) ,titl 
e(  [ 'vectorl,Class='  nxim2str  (class  (1, 1, 1) )  '  M='  num2str(M)]) 

%subplot (4,2,2) ;plot (l:8,c(l, : ,1,2) , ' * ' , 1 : 8 ,  w(class (1, 1, 2) , : ,1,2) ) , titl 
e ( [ 'vector2 , Class= '  num2str (class (1, 1, 1) )  '  M='  num2str(M)]) 

%qa(l:T, : ) =w (classmicro ( 1 : T) , : ) ; 

%qa (1 :T, : ) =w (classsta (1 :T) , : ) ; 
distances^ [ ] ; 

%for  n=l:segs 
%  for  trial=l:5 
%  distances=  [distances 

dist (c ( : , : ,n, trial) ,w (class ( : ,n, trial) , : ,n, trial) ' ) ] ; 

%end 


%end 

%figure(3) 

%stem(distances) , title ( 'Euclidean  Distance  original /quantized  vectors') 
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APPENDIX  C4.  SIGNAL  DECIMATION 


%  Filename:  coeffinterp.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  fuction  that  performs  a  decimation  of  segment  and  normalizes 
it. 

%  T:  total  number  of  segments 
%  c:  Coefficients  vector 

function  [c,T] =coeffinterp (word) ; 
word=word-mean  (word)  ; 

%int=interpl (1 : 1, word, 1:1/10000:10000) ; 

%pre=filter ( [1, - . 98] , 1, word) ; 
pre=word; 

%  harming  filter: 

%han=hanning ( 3 ) ; 

%pre=conv(han,pre) ; 
l=:length(word)  ; 

%start=l; 

step=l/2; 

%1=2000; 

%  0%  overlap 

overlap=ro\ind (  (0/100)  *step)  ; 

%#  of  Observations  T: 

T=l/step+ (l*overlap/step""2)  ; 

T=floor(T); 
for  n=l:T 

c  (n,  1 : 10)  =interpl  (1 :  step,pre  (1+step*  (n-1)  -overlap*  (n-1)  :step*n- 
overlap* (n-1)) ,l:step/10:step) ; 
end 

%c  ( : ,  1)  =c  ( : ,  1)  /max(c  (  :,!)); 

%c(:,l)=c(:,l)-.5; 

%  normalization: 
for  n=l:T 

c (n, : ) =c (n, : ) . /max(abs (c (n, : ) ) ) ; 

end 
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APPENDIX  C5.  HMM  TRAINING  AND  SCORING  FOR  MULTIPLE 
WEIGHTS  MINE  LIKE  SIGNAL  DATA 


%  Filename :  sequence .  m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Performs  all  the  sequence  of  HMM  training  and  scoring  the 
signals  from  multiple  lbs  mines 

%  rotating  every  time  the  tested  signalkegmat .mat :  imaginary 

cross  power  of  the  keg  signal 

%  cylmat.mat:  imaginary  cross  power  of  the  cylinder  signal 
%  back. mat:  imaginary  cross  power  of  a  non  target  signal 

%  testing  sequence 
seq=[2  3456 
1  3  4  5  6 
1  2  4  5  6 
1  2  3  5  6 
1  2  3  4  6 
12345]; 
for  test=l:6; 

ts=seq(test , : ) ; 

%  newmodelkegcyl  trains  the  HMM  using  the  seq  sequence  of  signals 
newmodelkegcyl 

%  scretestkegcyl  scores  the  HMM  for  every  testing  signals 

scoretestkegcyl 

P_bkeg224 (test) =P_bwk224; 

P_vkeg224  (test)  =P__vk224; 

P_bkeg432 ( test) =P_bwk432 ; 

P_vkeg432 (test) =P_vk432 ; 

P_bkeg536 (test) =P_bwk536; 

P_vkeg536 (test) =P_vk536 ; 

P_bcyl364  (test)  =P_bwc364; 

P__vcyl364  (test)  =P_vc364; 

P_bcyl468  (test)  =P_bwc468; 

P_vcyl468  (test)  =P__vc468; 

P_bcyl572  (test)  =P_bwc572 ; 

P_vcyl572 (test)=P_vc572; 

P__bback  ( t  e  s  t )  =  P_bwb  ; 

P„vback ( tes t ) =P_vb ; 

end 
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APPENDIX  C6.  GENERATION  OF  HMM  OBSERVATION  SEQUENCE  FOR 
MULTIPLE  WEIGHTS  MINE  LIKE  SIGNAL  DATA 


%  Filename :  newmode Ikegcyl . m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Creates  the  multiple  observation  matrix  0_multi  for  multiple 
observations 

%  HMM  training  of  multi  lbs  testing 

%  ts:  traininf  sequence,  assigned  at  sequence. m  file 
%  training  sequence: 

%  keg  (224,432,5361b) : 

%0_multi= [class ( : , 1, ts (1) ) ' ; class { : , 1, ts (2) ) ' ; class { ; , 1, ts (3) ) ' ; class ( : 
, 1, ts (4) ) ' ; class ( : , 1, ts (5) ) ' ; class { : , 2 , ts (1) ) ' ; class { : , 2 , ts (2) ) ' ; class ( 
: ,2, ts (3) ) ' ; class { : , 2, ts (4) ) ' ; class { : , 2, ts (5) ) ' ; class ( : ,3, ts (1) ) ' ; class 
{ : ,3, ts (2) ) ' ; class ( : ,3 , ts (3) ) ' ; class ( : , 3, ts (4) ) ' ; class ( : ,3, ts (5) ) ' ] ; 

%  cylinder  {364,468,5721b) 

0_multi= [class ( : , 4, ts (1) ) ' ; class ( : , 4, ts (2) ) ' ; class { : , 4, ts (3) ) ' ; class ( : , 
4, ts (4) ) ' ; class ( : , 4 , ts (5) ) ' ; class ( : , 5 , ts (1) ) ' ; class ( : , 5, ts (2 ) ) ' ; class { : 
, 5, ts (3) ) ' ; class ( : , 5 , ts (4) ) ' ; class ( : , 5 , ts (5) ) ' ; class ( : , 6 , ts (1) ) ' ; class ( 
: , 6, ts (2) ) ' /class ( : , 6, ts (3) ) ' /class { : , 6, ts (4) ) ' /class ( : , 6, ts (5) ) '] ; 

[a,b,pi]  =  trainmulti (0_multi,N,T,M) / 
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APPENDIX  Cl.  SCORING  FOR  THE  MULTIPLE  WEIGHT  MINE  LIKE 

SIGNAL  SET-UP 


%  Filename iscorekegcyl.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Scoring  of  all  testing  mine-like  signals-multiple  lbs  case 


'test:  keg2241b' 

0=class (: ,1, test) ; 

[P_bwk224, P_vk224] =score (a,b,pi , 0' ) 
if  (P_bwk224==:0) 

P_bwk224=eps; 

end 

if  (P_vk224==0) 

P_vk224=eps; 

end 

'test:  keg4321b' 

0=class ( : , 2 , test ) ; 

[P_bwk432 , P_vk432] =score (a,b,pi,0' ) 
if  (P_bwk432==0) 

P_bwk432=eps; 

end 

if  (P_vk432==0) 

P_vk432=eps; 

end 

'test:  keg5361b' 

0=class ( : , 3 , test) ; 

[P_bwk536, P_vk536] =score (a,b,pi,0' ) 
if  (P_bwk536==0) 

P_bwk536=eps; 

end 

if  {P_vk536==0) 

P_vk536=eps; 

end 

'test :  cylinder3641b' 

0=class ( : ,4, test) ; 

[P_bwc364, P_vc364] =score (a,b,pi,0' ) 
if  (P_bwc364==0) 

P_bwc364=eps  ; 

end 

if  (P_vc364==0) 

P_vc364=eps ; 

end 

' test :  cylinder4681b' 

0=class ( : , 5 , test ) ; 

[P_bwc468, P_vc468] =score (a, b,pi , 0' ) 
if  (P_bwc468==0) 

P_bwc468=eps; 

end 

if  (P_vc468==0) 
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P_vc468=eps; 

end 

'test :  cylinders 7 2 lb' 

0=class ( : , 6 , test ) ; 

[P_bwc572 , P_vc572] =score (a,b,pi, 0' ) 
if  {P_bwc572==0) 

P_bwc5  7  2  =eps ; 

end 

if  (P_vc572==0) 

P_vc572=eps; 

end 

' test :  background ' 

0=class ( : , 11, test)  ; 

[P_bwb, P_vb] =score (a,b,pi, 0' ) 
if  (P_bwb==0) 

P_bwb=eps ; 

end 

if  (P_vb==0) 

P_vb=eps ; 

end 


% figure 

subplot (6,2, test*2-l) ,  barh(10*logl0 ( [P_bwk224  P_bwk432  P_bwk536 
P_bwc364  P__bwc468  P_bwc572  P_bwb] )  )  ; 
if  test==l 

title(['  Pr[0 1 laitida( cylinder )  ]  -  1  :keg_2_2_4_l_b, 

2  :  keg_4_3_2_l_b ,  3  :  keg_5__3_6_i_b ,  4 :  cy  1  inder_3_6_4_l_b , 

5 :  cylinder_4_6__8_l_b,  6  :  cylinder_5_7_2_l_b,  7  :  background,  M= ' 

nuin2str{M)  N='  nuin2str  (N)  ] ) 

end 

axis  (  [*-50  0  1  7] )  ; 
if  test==6 

xlabel  ( '  P_V_i__t__e_r_b_i  [dB]  ' ) 

end 

if  test==6 

xlabel  ( '  P_B_W [dB]  ' ) 
end 

if  test==3 

ylabel ( ' tested  segment ' ) 
end 

if  test==3 

ylabel ( ' tested  segment ' ) 
end 

grid 

hold 

subplot (6,2, test *2) ,  barh(10*logl0 ( [P_vk224  P_vk432  P_vk536  P_vc364 
P_vc468  P_vc572  P_vb] ) ) ; 

%title ( [ 'Viterbi  Probability  -  l:keg_2241b,  2:keg_4321b,  3:keg_5361b 
4 : cy 1 inder_3  641b,  5 : cy 1 inder_4  681b,  6 : cy 1 inder_5 721b,  7 : background ,  M 
num2str(M)  ',  N='  num2str  (N)  ]  )  ,  xlabel  (' P_V_i_t_e_r„b_i  [dB] ')  , 
ylabel ( ' tested  segment ' ) 
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axis { [-50  0  1  7] ) ; 
grid 

if  test==6 

xlabel  ( '  P_V__i_t_e_r_b_i  [dB]  ' ) 
end 

if  test==3 

ylabel ( ' tested  segment ' ) 
end 
hold 
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APPENDIX  C8.  DATA  SELECTION  FOR  THE  MULTIPLE  DISTANCE  SIGNAL 
EXPERIMENT  SETUP.  DATA  USED  FOR  HMM  TRAINING 


%  Filename:  multift.m 
%  Written  by;  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Generates  and  plots  the  signals  for  the  multiple  FT  HMMs 
training 

%  kegmat.mat:  imaginary  cross  power  of  the  keg  signal 

%  cylmat.mat:  imaginary  cross  power  of  the  cylinder  signal 

%  back. mat:  imaginary  cross  power  of  a  non  target  signal 


load  multiftkeg 
load  cylSnoe 
load  back; 

range^linspace (1, 60 , 3000) ; 

keg6ft(2276, : )=[] ;kegl0ft (2276, :)=[]; 

sk=size (keg6f t ) ; sc=size (cyl6noe) ; sb=size (back) ; 

kegint=zeros (3000, 6) ; cyllint=zeros (3000, 6) ;backint=zeros (3000, 6) ; 

%  interpolation  to  3000  points 
for  n=l : 6 

keg6ftint ( ; ,n)  =  (interpl (1 : sk (1) ,keg6ft ( : ,n) , 1 ; sk (1) /3001;sk{l) ) )  '  ; 
kegSftint ( : ,n)  =  (interpl (1 : sk(l) ,keg8ft ( : ,n) , 1 : sk(l) /3001 :sk(l) ) )  '  ; 
keglOftint ( : ,n) = (interpl (1 : sk(l) ,kegl0ft( : ,n) ,l:sk(l) /3001:sk(l) ) ) '; 

cyl6noeint ( : ,n)  =  (interpl (1 : sc (1) , cyl6noe ( ; , n)  ,  1 : sc (1) /3001 : sc (1) ) ) ' ; 
backint ( : ,n)  =  (interpl (1 : sb(l) ,back( : ,n) , 1 :sb(l) /3001 : sb(l)  )  )  '  ; 

end 


subplot (4, 1, 1) 

plot (range, keg6f tint ( : , 1) , 'c ' , 'LineWidth' , 1) ,hold 
plot (range, keg6f tint ( : ,2) , 'm' , 'LineWidth' , 1) 
plot (range, keg6f tint ( : , 3) , 'g' , 'LineWidth' , 1) 
plot (range, keg6f tint ( : , 4) , 'b' , 'LineWidth' ,1) 
plot (range, keg6f tint ( ; , 5) , 'k' , 'LineWidth' ,1) 
plot (range, keg6f tint ( ; , 6) , 'r' , 'LineWidth' ,1) 
hold 

title( 'Powder  Keg  6ft  (total;24ft) ' ) 

axis([0  60  -0.02  0. 02]), grid; 
subplot (4,1,2) 

plot (range, kegSftint ( : , 1) , 'c' , 'LineWidth' , 1) ,hold 
plot (range, kegSftint ( : , 2) , 'm' , 'LineWidth' ,1) 
plot (range, kegSftint ( : ,3) , 'g' , 'LineWidth' ,1) 
plot (range, kegSftint ( : ,4) , 'b' , 'LineWidth' ,1) 
plot (range, kegSftint ( : , 5) , 'k' , 'LineWidth' , 1) 
plot (range, kegSftint ( : , 6) , 'r' , 'LineWidth' , 1) 
axis ( [0  60  -0.02  0.02]), grid; 
title( 'Powder  Keg  8ft  (total:22ft) ' ) 

hold 

subplot (4,1,3) 
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plot  (range, keglOf  tint  { : ,  1)  ,  'c' ,  'LineWidth' ,  1)  ,hold 
plot  (range,  keglOf  tint  ( : ,  2)  , 'm' ,  'LineWidth' ,  1) 
plot  (range,  keglOf  tint  ( : ,  3 )  ,  '  g' ,  'LineWidth' ,  1) 
plot (range, keglOf tint ( : , 4) , 'b' , 'LineWidth' , 1) 
plot (range, keglOf tint ( : , 5) , 'k' , 'LineWidth' , 1) 
plot (range, keglOf tint ( : , 6) , 'r ' , 'LineWidth' , 1) 
axis ( [ 0  60  -0.02  0.02]), grid; 
title (' Powder  Keg  10ft  ( total : 20ft )' ) 

hold 

range=l:3000; 
subplot (4,1,4) 

plot (range, cyl6noeint ( : , 1) , ' c ' , 'LineWidth' , 1) , hold 
plo t ( range , cy 1 Gnoeint ( : , 2 ) , ' m ' , ' LineWidth ' , 1 ) 
plot (range, cyl6noeint ( : , 3 ) , 'g' , 'LineWidth' , 1) 
plo t ( range , cyl Gnoeint ( : , 4 ) , ' b ' , ' LineWidth ' , 1 ) 
plot (range, cylGnoeint ( : ,5) , 'k' , 'LineWidth' ,1) 
plot ( range , cyl Gnoeint ( : , 6 ) , ' r ' , ' LineWidth ' , 1 ) 
axis([0  3000  -0.02  0. 02]), grid; 

title ( 'Cylinder  10ft  (total : 20ft) ' ) 
hold 

for  n=l : 6 

ikG (n) =f ind  (kegGf tint (1000:1800, n) ==max (kegGf tint (1000 : 1800, n) )) ; 
ik8 (n) =f ind  (keg8f tint (1000:1800, n) ==max (kegSf tint (1000 : 1800, n) )) ; 
iklO (n) =f ind 

(keglOf tint (1000:1800, n) ==max (keglOf tint (1000 : 1800 ,n) ) ) ; 
ik6(n)=ik6 (n) +1000-100 ; 
ik8(n)=ik8 (n) +1000-100 ; 
ikl0(n)=:ikl0  (n) +1000-100; 

end 

ik6=ik6(l) ;ik8=ik8(l) ; ikl0=ikl0 (1) ; 
for  n=l:6 

iclO (n) =f ind 

(cylGnoeint (1000 : 1800 , n) ==max( cyl Gnoeint (1000 : 1800, n) ) ) ; 
iclO (n)=icl0 (n) +1000-100 ; 

end 

icl0=icl0 (1) -100; 

%  so  now  we  know  where  the  target  is  located,  we  can  build  the  signal 
file 

%  signal (data, mine, trial) 

%  mine(l-3)  is  for  the  keg  (6,8,10  ft)'  s  target,  mine(4)  is  for  the 
cylinder '  s  target 

%  and  mine (5-10)  background  and  other  non  target  data 
signalmulti= zeros (400,10,6)  ; 
for  trials=l:6 

signalmulti ( : , 1, trials) =keg6f tint (ikG : ik6+399, trials) ; 
signalmulti ( : , 2 , trials) =keg8ftint (ik8 : ik8+399, trials) ; 
signalmulti ( : , 3 , trials) =kegl0ftint (iklO : iklO+399 , trials) ; 
signalmulti ( : , 4, trials) =cyl6noeint (iclO :icl0+399, trials) ; 
signalmulti ( : , 5 , trials) -backint (iclO : iclO+399, trials) ; 
signalmulti ( : , 6, trials) =backint (iklO : iklO+399, trials) ; 
signalmulti ( : , 7 , trials) =backint (2001:2400, trials) ; 
signalmulti ( : , 8 , trials) =backint (1500:1899, trials) ; 
signalmulti ( : , 9 , trials) =backint (800 : 800+399, trials) ; 
signalmulti ( : , 10 , trials) =backint (2500:2899, trials) ; 


signalmulti ( : , 3 , : ) =signalmulti ( : , 3 , : ) /3 ; 
for  m=l : 4 
for  n=l:6 

signalmulti ( : ,m,n) =signalmulti ( : ,m, n) -mean (signalmulti ( : ,m, 

end 
end 
cd  .  . 

save  signalmulti  signalmulti 

cd  data 

figure (2) 

range=l:400; 

subplot (4, 1, 1) 

plot (range, signalmulti ( : , 1, 1) , 'c' , 'LineWidth' , 1) , 
plot (range, signalmulti ( : , 1, 2 ) , 'm' , 'LineWidth' , 1) 
plot (range, signalmulti (:,l,3),'g', 'LineWidth' , 1) 
plot (range, signalmulti (:,l,4),'b',' LineWidth' , 1) 
plot (range, signalmulti ( : , 1, 5) , 'k' , 'LineWidth' , 1) 
plot (range, signalmulti (:,l,6),'r', 'LineWidth' , 1) 
hold 

subplot (4, 1, 2 ) 

plot (range, signalmulti (:,2,l),'c',' LineWidth' , 1) , 
plot (range, signalmulti ( : , 2 , 2 ) , 'm' , 'LineWidth' , 1) 
plot (range , signalmulti (:,2,3),'g',' LineWidth ' , 1 ) 
plot (range, signalmulti (:,2,4),'b', 'LineWidth' , 1) 
plot (range, signalmulti (:,2,5),'k', 'LineWidth' , 1) 
plot (range, signalmulti (:,2,6),'r', 'LineWidth' , 1) 
hold 

subplot (4, 1, 3) 

plot (range, signalmulti (:,3,l),'c',' LineWidth' , 1) , 
plot (range, signalmulti ( : , 3 , 2) , 'm' , 'LineWidth' , 1) 
plot (range, signalmulti (:,3,3),'g', 'LineWidth' , 1) 
plot (range, signalmulti (:,3,4),'b', 'LineWidth' , 1) 
plot (range, signalmulti ( : , 3 , 5) , 'k' , 'LineWidth' , 1) 
plot (range, signalmulti (:,3,6),'r', 'LineWidth' , 1) 
hold 

subplot (4, 1,4) 

plot (range, signalmulti (:,4,l),'c', 'LineWidth' , 1) ^ 
plot (range, signalmulti ( : , 4, 2) , 'm' , 'LineWidth' , 1) 
plot (range, signalmulti (:,4,3),'g',' LineWidth' , 1) 
plot (range, signalmulti (:,4,4),'b', 'LineWidth' , 1) 
plot (range, signalmulti (:,4,5),'k', 'LineWidth' , 1) 
plot (range, signalmulti ( : , 4, 6) , 'r' , 'LineWidth' , 1) 
hold 


n) )  ; 


hold 


hold 


hold 


hold 


127 


APPENDIX  C9.  FEATURE  ANALYSIS  FOR  MINE  LIKE  OBJECT  SIGNALS; 
MULTIPLE  DISTANCES  EXPERIMENT  SET-UP 


%  Filename:  trainingkmmulti .m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Features  Analysis  for  mine-like  object  signals  with 
multiple  ft 

%  (3  sets  of  ft  for  the  keg  and  one  for  the  cylinder  -  6 

trials  each,  and  6  trials  of  background  signals) ,  for 
%  HMM  recognition  use.  We  will  create  two  classes,  one  for 

every  mine-like  object. 

%  Vector  quantization  using  k-means 

%  M:  #  of  symbols 
%  N:  #  of  states 
%  mines:  #  total  #  of  mines 
%  c:  #  coefficients  matrix  for  all  mines 

%  w:  #  VQ  coeff  matrix 

clear 

load  signalmulti 
signal=signalmulti ; 
clear  signalmulti; 

M=8;N=4; 
rand( ' seed' , 1) ; 
n_it=20 ; 

%segs=4; 
mines =10 ; 


s=size( signal) ;sl=s(l)  ; 


%seg=f ix (sl/segs) ; 

%Coef f icients  evaluation 
%c=zeros (T, 4 , 8, segs, 6)  ; 
for  mine=l:mines 
for  trial=l:6 

[c  (  :  ,  :  ,mine,  trial)  ,T]  =coeffinterp  (signal  { :  ,mine,  trial) )  ; 


end 

end 


%  vector  quantization 
pl=t] ; 

for  mine=l:mines 
for  trial=l:6 

pl= [pi  ; c ( : , : , mine , trial ) ] ; 
end 
end 
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[CODE,  label,  dist]  =  svg(pl,  M,  n_it) ; 

class=zeros (T, mines, 5) ; 
w=zeros (T, 8, mines, 5) ; 
for  mine=l:mines 
for  trial=l:6 

[w  class ( : , mine, trial) ,  dist] =vq(c { : , : , mine, trial) ,CODE,n_it) ; 
end 
end 

%  Test  of  vector  quantization 
%figure(l) 

%subplot (4, 2,1) ;plot (1:8, c{l, :,1,1) , ' ,1 : 8,w(class (1,1,1) , :,1,1)) , titl 
e(  [ 'vectorl,Class='  n\im2str  (class  (1 , 1, 1) )  '  M='  nxim2str  (M)  ] ) 

%subplot (4, 2 , 2 ) ;plot (1 : 8, c (1, :,1,2),'*',1 : 8 , w(class (1, 1,2),:, 1,2)), titl 
e ( [ 'vector2 , Class= '  num2str (class (1, 1, 1) )  '  M='  num2str{M)]) 

%qa(l:T, : ) =w(classmicro (1 :T) , : ) ; 

%qa(l:T, : ) =w (classsta ( 1 : T) , : ) ; 
distances= [ ] ; 

%for  n=l:segs 
%  for  trial=l:5 
%  distances=  [distances 

dist (c ( : , : ,n, trial) , w(class ( : ,n, trial) , : ,n, trial) ' ) ] ; 

%end 


%end 

%figure (3) 

%stem(distances) , title ( 'Euclidean  Distance  original /quantized  vectors') 
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APPENDIX  CIO.  HMM  TRAINING  AND  SCORING  FOR  MULTIPLE 
DISTANCE  SIGNAL  MINE  LIKE  SIGNAL  DATA 


%  Filename :  sequencemulti .m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Performs  all  the  sequence  of  HMM  training  and  scoring  the 
signals  from  multiple  ft  mines 

%  rotating  every  time  the  tested  signalkegmat .mat :  imaginary 

cross  power  of  the  keg  signal 

%  cylmat.mat:  imaginary  cross  power  of  the  cylinder  signal 
%  back. mat:  imaginary  cross  power  of  a  non  target  signal 


%  testing  sequence  for  multi  target  distance 
%  rotation  of  the  testing  signal: 
seq=[2  3456 
1  3  4  5  6 
1  2  4  5  6 
1  2  3  5  6 
1  2  3  4  6 
12345]; 
for  test=l:6; 

ts=seq(test; : ) ; 
newmode  Imu  1 1  i 
scoretestmulti 
P_bkeg6ft ( test) =P_bwk6; 

P_vkeg6ft (test) =P_vk6; 

P__bkeg8ft  (test)  =P_bwk8; 

P__vkeg8ft  (test)  =P_vk8  ; 

P_bkeglOft (test) =P_bwklO; 

P_vkeglOft (test) =P_vklO; 

P—bcyllOft (test) =P_bwc; 

P-.vcyllOft  (test)  =P_vc; 

P_bback ( test) =P_bwb; 

P_vback ( test) =P_vb; 


end 
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APPENDIX  Cll.  GENERATION  OF  HMM  OBSERVATION  SEQUENCE  FOR 
MULTIPLE  DISTANCES  MINE  LIKE  SIGNAL  DATA 


%  Filename:  newmode Imul ti .m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose;  Creates  the  multiple  observation  matrix  0_multi  for  multiple 
observation 

%  HMM  training  of  multi  ft  testing 

%  ts:  traininf  sequence,  assigned  at  sequence. m  file 
%  training  sequence: 


%  initial  conditions: 

%  class (coefficient (1-T) , segment (1-4) , mine, mass) 

%  keg  (6,8,10ft) : 

0__multi=  [class  (  : ,  1,  ts  (1)  )  ' ; class  ( : ,  1,  ts  (2)  )  ' ; class  ( : ,  1,  ts  (3) )  ' ; class  ( : , 
1, ts (4) ) ' ; class ( : , 1, ts (5) ) ' ; class ( : , 2, ts (1) ) ' ; class ( : , 2, ts (2) ) ' ; class { : 
, 2, ts (3) ) ' ; class ( : , 2, ts (4) ) ' ; class ( : , 2, ts (5) ) ' ; class { : , 3, ts (1) ) ' ; class ( 
: , 3 , t s ( 2 ) ) ' ; class { : , 3 , t s ( 3 ) ) ' ; class ( : , 3 , t s ( 4 ) ) ' ; class ( : , 3 , ts ( 5 ) ) ' ] ; 

%  cylinder  (single  10  ft) 

%0_multi= [class ( : , 4, ts (1) ) ' ; class ( : , 4, ts (2) ) ' ; class ( : , 4, ts (3) ) ' ; class ( : 
,4,ts(4) ) ' ; class ( : ,4,ts{5) ) ']; 

[a,b,pi]  =  trainmulti (0_multi,N,T,M) ; 
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APPENDIX  C12.  TESTING  DATA;  SCORING  FOR  THE  MULTIPLE 
DISTANCES  MINE  LIKE  SIGNAL  SET-UP 


%  Filename:  scoretestmulti .m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Scoring  of  all  testing  mine-like  signals-multiple  ft  case 

'test:  keg6ft ' 

0=class ( : , 1, test) ; 

[P_bwk6 , P_vk6] =score (a, b,pi, O' ) 
if  {P_bwk6==0) 

P_bwk6=eps; 

end 

if  {P_vk6==0) 

P_vk6=eps; 

end 

'test:  kegSft' 

0=class ( : , 2 , test) ; 

[ P_bwk8 , P_vk8 ] =score ( a , b , pi , 0 ' ) 
if  (P_bwk8==0) 

P_bwk8=eps; 

end 

if  (P_vk8==0) 

P_vk8=eps; 

end 

' test :  keglOf t ' 

0=class ( : , 3 , test) ; 

[P_bwkl0 , P_vkl0] =score (a,b, pi , O' ) 
if  (P_bwkl0==0) 

P_bwklO=eps  ; 

end 

if  (P__vkl0==0) 

P_vkl0=eps; 

end 

'test:  cylinder' 

0=class ( : , 4 , test ) ; 

[P_bwc, P_vc] = score (a, b, pi , O' ) 
if  (P_bwc==0) 

P_bwc=eps; 

end 

if  {P_vc==0) 

P_vc=eps ; 

end 

' test :  background ' 

0=class ( : , 8, test) ; 

[P_bwb, P_vb] =score (a, b, pi, O' ) 
if  (P_bwb==0) 

P_bwb==eps; 

end 

if  {P_vb==0) 

P_vb=eps; 

end 
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siibplot  (6,2,  test *2-1)  ,  barh(10*logl0  ( [P_bw}c6  P_bwk8  P_bwklO  P_bwc 
P_bwb] ) )  ; 
if  test==l 

title{['  Pr (0| lamda_k_e_g)  -  l:keg_6ft, 

2:keg_8ft,  3:keg_l_0ft,  4:cylinder,  test : background,  M='  nuin2str(M)  S 
N='  nxain2str{N)  ,  ' ,  T=2  ']) 

end 

if  test==6 

xlabel ( ' P_B_W [dB] ' ) 

end 

if  test==3 

ylabel ( 'tested  signal') 
end 

axis { [-50  0  1  5] ) ; 
grid 

subplot (6, 2, test *2) ,  barh (lO^loglO ( [P_vk6  P_vk8  P_vkl0  P_vc  P_vb])); 
%title( [ 'Viterbi  Probability  -  l:keg_6ft,  2:keg_8ft,  3:keg_10ft, 
4:cylinder,  test : background,  M='  nuin2str(M)  ',  N='  ntiin2str (N)  ]  ) 
if  test==6 

xlabel  ( '  P_JV_i_t_e_r_b_i  [dB]  ' ) 
end 

if  test==3 

ylabel ( ' tested  signal ' ) 
end 

axis ( [-50  0  1  5] ) ; 

grid 

hold 
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APPENDIX  D.  MATLAB  CODE;  NEURAL  NETWORK  BASED  CLASSIFIER 

FOR  MINE  RECOGNITION 


This  Appendix  contains  MATLAB  files  used  to  recognize  mine  like  objects  using 
a  supervised  backpropagation  feedforward  neural  network. 
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APPENDIX  Dl.  NN  TRAINING  AND  TESTING  FOR  MULTIPLE  WEIGHTS 
SIGNAL  MINE  LIKE  SIGNAL  DATA 


%  Filename:  nnlbs.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Performs  mine  like  object  signal  classification  -  multiple 
weights  set-up 

%  using  a  supervised  backpropagation  feedforward  neural 

network 

%  ts:  testing  sequence  used  for  the  testing/training  rotation 

%  0_multi_keg:  multiple  observation  of  keg  signals 

%  0_multi_cyl:  multiple  observation  of  cylinder  signals 

%  p:  input  vectors 

%  t:  target  vector;  1  for  keg,  2  for  cylinder 

%  sc:  output  matrix 

seq=[2  3  4  5  6;1  3  4  5  6;1  2  4  5  6;1  2  3  5  6;1  2  3  4  6;1  2  3  4  5] ; 
sc=  [  ]  ; 

for  test=l:6; 


ts=seq(test, : ) ; 


0_multi_keg= [class ( : , 1, ts (1) ) ' ; class ( : , 1, ts (2) ) ' ; class ( : , 1, ts (3) ) ' ;clas 
s  (  : , 1, ts (4) )  ' ; class ( ; , 1, ts (5) ) ' ; class { : , 2 , ts (1) ) ' ; class ( : , 2, ts (2) )  ' ;cla 
ss ( : ,2 , ts (3 ) ) ' ; class ( : , 2 , ts (4) ) ' ; class ( : , 2 , ts (5) ) ' ; class ( : , 3 , ts ( 1 ) ) ' ; d 
ass{ : ,3,ts(2) ) ' ; class ( : , 3 , ts (3 ) ) ' ; class ( : , 3 , ts (4) ) ' ; class ( : , 3 , ts (5) ) '] ' 

t 

0_multi_cyl= [class { : ,4, ts (1) ) ' ; class ( : ,  4, ts (2) ) ' ; class ( : ,4, ts(3) ) ' ;clas 
s(:,4,ts(4)) ' ;class( : ,4,ts(5) ) ' ; class { : , 5 , ts ( 1 ) ) ' /class { : , 5, ts (2) ) '/cla 
ss ( : , 5, ts (3 ) ) ' /class ( : , 5 , ts (4) ) ' / class ( : , 5, ts (5) ) ' /class ( : , 6, ts (1) ) ' /cl 
ass ( : , 6 , ts (2 ) ) ' / class ( : , 6, ts (3 ) ) ' / class { : , 6 , ts (4) ) ' / class ( : , 6 , ts (5) ) ' ] ' 

/ 

%  generation  of  input  vectors  (matrix)  p: 
p=  [0__multi_keg  0„multi_cyl]  ; 

t=[l  1111111111111122222222222222  2]; 

%  #  of  hidden  layers: 60.  #  of  output  layers:!  Functions  used:  logsing, 

purelin  for  hidden 
%  and  output  layer  relatively 

net=newf f ([1  8;  1  8], [60  !],{' logsig'  'purelin' } , ' trainlm' ) ; 
figure (1) 

net .perf onnFcn= ' sse ' ; 

net . trainParam.min_grad=le-20 

net . trainParam. goal=0 . 01 ; 

net . trainParam. show=10 ; 

net . trainParam. epochs=3 00 ; 

[net, tr] =train (net , p, t) ; 

01=class ( : , 1, test) ; 
keg224=sim(net , 01 ) 

02=class ( : , 2 , test) ; 
keg432=sim(net , 02 ) 
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03=class ( : , 3 , test) ; 
keg53  6=sim (net , 03 ) 

04=class ( : , 4, test) ; 
cyl364=sim(net , 04) 

05=class ( : , 5, test) ; 
cyl468=sim(net, 05) 

06=class { : , 6, test) ; 
cyl572=sim{net , 06) 

07=class ( : , 7 , test) ; 
back=sim (net , 07 ) 
sc=[sc; 

keg224,  keg432,  keg536,  cyl364,  cyl468,  cyl572,  back] ; 
figure (2) 

%subplot (7, 1, test) ,  sternd  : 7 , sc) ,  axis([l  7  1  2. 5]), hold 
end 

%  NN  output  plot : 

subplot (3 , 3 , 1) /  plot (1:6,  sc(:,l)),  axis([l  6  0  4]), title 
( ['keg2241bs'] ) ,grid 

subplot (3,3,2) ,  plot (1:6,  sc(:,2)),  axis([l  6  0  4]), title 
( [ 'keg4321bs ' ] ) / grid 

subplot (3,3,3) ,  plot (1:6,  sc(:,3)),  axis([l  6  0  4]), title 
( ['keg5361bs'] ) ,grid 

subplot (3 , 3 , 4) ,  plot (1:6,  sc(:,4)),  axis([l  6  0  4]), title 
{['cyl3641bs']) ,grid 

subplot (3 , 3 , 5) ,  plot (1:6,  sc(:,5)),  axis([l  6  0  4]), title 
( [ ' cyl4681bs ' ] ) / grid 

subplot (3 , 3 , 6) ,  plot (1:6,  sc(:,6)),  axis([l  6  0  4]), title 
(['cyl5721bs']) ,grid 

subplot (3,3,7) ,  plot (1:6,  sc(:,7)),  axis([l  6  0  4]), title 
( [ 'background' ] ) , grid 
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APPENDIX  D2.  NN  TRAINING  AND  TESTING  FOR  MULTIPLE  DISTANCES 

SIGNAL  MINE  LIKE  SIGNAL  DATA 


%  Filename:  nnft.m 
%  Written  by:  M.  Zambartas 
%  Date  Last  Modified  10  August  1999 

%  Purpose:  Performs  mine  like  object  signal  classification  -  multiple 
distances  set-up 

%  using  a  supervised  backpropagation  feedforward  neural 

network 

%  ts:  testing  sequence  used  for  the  testing/training  rotation 

%  0_multi_keg:  multiple  observation  of  keg  signals 

%  0_multi_cyl:  multiple  observation  of  cylinder  signals 

%  p:  input  vectors 

%  t:  target  vector;  1  for  keg,  2  for  cylinder 

%  sc :  output  matrix 

seq=[2  345  6;1  3  4  5  6;1  2  4  5  6;1  235  6;1  234  6;1  2345]; 

SC=[]  ; 

for  test=l:6; 
ts=seq(test , : )  ; 

0_multi_keg=  [class  ( :  ,  1,  ts  (1) )  '  ;class  ( : ,  1,  ts  (2)  )  '  ;class  ( : ,  1,  ts  (3) )  '  ;clas 
s ( : , 1, ts (4) )  ' ; class ( : , 1 , ts ( 5 ) )  ' ; class ( : , 2 , ts (1) )  ' ; class ( : , 2 , tS (2 ) )  '  ^cla 
SS ( : , 2 , ts (3 ) ) ' ; class ( : , 2 , ts (4) ) ' ; class ( : , 2 , ts (5) ) ' ; class ( : , 3 , ts ( 1 ) ) ' ; cl 
ass ( : , 3 , ts (2 ) ) ' ; class ( : , 3 , ts (3 ) ) ' ; class ( : , 3 , ts (4) ) ' /class ( : , 3 , ts (5) ) ' ] ' 
/ 

0_multi_cyl= [class ( : , 4, ts (1) ) ' /class ( : , 4, ts (2) ) ' /class ( : , 4, ts (3) ) ' /clas 
s { : , 4, ts (4) ) ' /class ( : , 4, ts (5) ) ' ] ' / 

%  generation  of  input  vectors  (matrix)  p: 
p=  [0_multi„keg  0__multi_cyl]  ; 

t=[l  111111111111112222  2]; 

%  #  of  hidden  layers: 60.  #  of  output  layers:!  Functions  used:  logsing, 

purelin  for  hidden 
%  and  output  layer  relatively 

net=newf f ([1  8;  1  8],  [60  !],{' logsig'  'purelin' } , ' trainlm' ) ; 
figured) 

net.performFcn='sse' ; 
net . trainParam . min_grad=le-2  0 
net . trainParam. goal=0 . 001/ 
net . trainParam. show=10 ; 
net . trainParam. epochs=3 00 ; 

[net, tr] =train{net ,p, t) ; 

01=class ( : , 1, test) ; 
keg6f t=sim(net , 01) 

02=class ( : , 2 , test) ; 
kegSf t=sim (net , 02 ) 

03=class ( : , 3 , test) ; 
keglOf t=sim(net , 03) 

04=class ( : , 4 , test) ; 
cyllOf t=sim(net, 04) 

05=class ( : , 5 , test )  ; 
back^sim (net , 05 ) 
sc= [sc; 

keg6ft,  kegSft,  keglOft,  cyllOft,back] ; 
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figure (2) 
end 

subplot (3 , 3 , 1) ,  plot (1:6,  sc ( : , 1) , 'o' ) ,  axis([l  6  0  4]), title 
{ [ 'keg6ft' ] ) ,grid 

subplot (3, 3, 2) ,  plot (1:6,  sc(:,2)),  axis([l  6  0  4]), title 
( [ 'kegSft' ] ) , grid 

subplot (3, 3, 3) ,  plot (1:6,  sc(:,3)),  axis([l  6  0  4]), title 
( ['keglOft'] ) ,grid 

subplot(3,3, 4) ,  plot(l:6,  sc(:,4)),  axis([l  6  0  4]), title 
( [ 'cyllOft' ] ) ,grid 

subplot (3 , 3, 5) ,  plot(l:6,  sc(:,5)),  axis([l  6  0  4]), title 
( [ 'background' ] ) , grid 
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